January, 2019

Managing Database Access Control For Teams With strongDM

Data Engineering Podcast

Summary: Controlling access to a database is a solved problem… right? It can be straightforward for small teams and a small number of storage engines, but once either or both of those start to scale, things quickly become complex and difficult to manage. After years of running across the same issues in numerous companies and even more projects, Justin McCarthy built strongDM to solve database access management for everyone.

Detecting Performance Anomalies in External Firmware Deployments

Netflix Tech

By Richard Cool. Netflix has over 139M members streaming on more than half a billion devices spanning over 1,700 different types of devices from hundreds of brands. This diverse device ecosystem results in a high-dimensionality feature space, often with sparse data, and can make identifying device performance issues challenging. Identifying ways to scale solutions in this space is vital as the ecosystem continues to grow both in volume and diversity.

Open Data Science and Machine Learning for Business with Cloudera Data Science Workbench on HDP

Cloudera

It’s official: Cloudera and Hortonworks have merged, and today I’m excited to announce the availability of Cloudera Data Science Workbench (CDSW) for Hortonworks Data Platform (HDP). Trusted by large data science teams across hundreds of enterprises (Western Union and IQVIA, to name just a couple), CDSW is now also ready to help Hortonworks customers accelerate the delivery of new data products through secure, collaborative data science at scale.

Who Was Smarter, Karl Benz or Sigmund Freud?

Teradata

David Socha compares Karl Benz and Sigmund Freud, two people who fundamentally and indisputably influenced how we live today.

Aarhus Engineering Internship: Building Aggregation Support for YQL, Uber’s Graph Query Language for Grail

Uber Engineering

Lau Skorstengaard is a Ph.D. student at Aarhus University who pursued a 2018 internship with Uber Engineering’s Aarhus, Denmark office. In this article, Lau discusses his path to Uber and the technical challenges faced while building his internship project as …

Quality Conversations

Pandora Engineering

You’re in a maze of twisty little passages, all alike. The state of the art in online computer gaming circa 1976 was a game called Adventure, which consisted of typing short instructions into a computer terminal and getting terse responses, which you had to interpret to complete a vaguely defined mission.

More Trending

Improving Experimentation Efficiency at Netflix with Meta Analysis and Optimal Stopping

Netflix Tech

By Gang Su & Ian Yohai. From living rooms in Bogota, to morning commutes in Tokyo, to beaches in Los Angeles and dorms in Berlin, Netflix strives to bring joy to over 139 million members around the globe and connect people with stories they’ll love. Every bit of the customer experience is imbued with innovation, right from the very first encounter with Netflix during the signup process.

The New Cloudera

Cloudera

A new year is always an opportunity for change. This year, we’re making a big one. On January 3, we closed the merger of Cloudera and Hortonworks — the two leading companies in the big data space — creating a single new company that is the leader in our category. We are well positioned to deliver even more innovation and success than we have independently over the last decade.

How Data Privacy Can Be Good for Your Business

Teradata

Regulations like GDPR are an opportunity for many organizations. Reiner Kappenberger explains how data privacy can be good for your business.

How OCR Can Help Employees Fight Through Most Mundane Tasks

InData Labs

These days, office employees need an AI hero. Can you imagine the number of hours wasted on handling a paper-based workflow? Isn’t it time to save employees from piles of paper? No one is saying it will be easy to eliminate paper documents promptly. For instance, in the legal sphere where the cost of a…

Keeping Pace with New iOS Releases

Pandora Engineering

How We Updated Pandora on iOS 12 Launch Day. The Story That Shook the Press: The Pandora app was among the very few enterprise apps that successfully released an update for iOS 12 on Apple’s day-one September 21 launch date, supporting the exciting new Siri Shortcuts feature. Here are some notable quotes: Engadget, “Music app Pandora is taking advantage of Shortcuts at iOS 12’s launch.

TimescaleDB: The Timeseries Database Built For SQL And Scale - Episode 65

Data Engineering Podcast

Summary: The past year has been an active one for the timeseries market. New products have been launched, more businesses have moved to streaming analytics, and the team at Timescale has been keeping busy. In this episode, TimescaleDB CEO Ajay Kulkarni and CTO Michael Freedman stop by to talk about their 1.0 release, how the use cases for timeseries data have proliferated, and how they are continuing to simplify the task of processing your time-oriented events.

The Product Playbook

Zalando Engineering

Shared language and visualizing to deliver great products. Football is an environment with changing variables that players and coaches need to react to. Teams attempt to move the ball down the field by running or passing in a set number of plays. If you’ve ever watched a football game, you will have seen coaches holding a subset of plays from the coach’s playbook that they think may work for the game they are playing.

Big Data Fabric Weaves Together Automation, Scalability, and Intelligence

Cloudera

Today’s data landscape is characterized by exponentially increasing volumes of data, comprising a variety of structured, unstructured, and semi-structured data types originating from an expanding number of disparate data sources located on-premises, in the cloud, and at the edge. Alongside this evolving data ecosystem come business demands for reliable, trustworthy, up-to-date data to enable real-time, actionable insights.

Five Challenges to Building Models with Relational Data

Teradata

Ben MacKenzie reflects on some of the unique challenges to building models with relational data.

Live Dashboards with Redash and Rockset

Rockset

Redash is a powerful open source query and visualization tool that helps you make sense of your data. It connects to a variety of data sources and also includes a native connector for Rockset. In this post we will demonstrate how to use Redash to build live dashboards on Rockset data sets. Configure: If you've never used Redash before, you need to set it up first.

Performing Fast Data Analytics Using Apache Kudu - Episode 64

Data Engineering Podcast

Summary: The Hadoop platform is purpose-built for processing large, slow-moving data in long-running batch jobs. As the ecosystem around it has grown, so has the need for fast data analytics on fast-moving data. To fill this need, the Kudu project was created with a column-oriented table format that was tuned for high volumes of writes and rapid query execution across those tables.

How to Fill Your AI Talent Gap

Teradata

Atif Kureishy explores how to fill the artificial intelligence skills gap.

Using Data to Answer the Key Challenge to Enterprise Reinforcement Learning

Teradata

Applying deep reinforcement learning to real-world problems has the potential to revolutionize how businesses tackle many of their core challenges.

What Happened to Big Data?

Teradata

The definition of insanity is doing the same thing over and over and expecting different results.

Nakadi Goes to FOSDEM

Zalando Engineering

Nakadi is Zalando’s open source event streaming platform. It is based on Apache Kafka. It started as a simple HTTP proxy, providing a REST interface to publish and consume JSON messages. It quickly evolved, with the addition of schema validation and evolution, self-service authorization, a subscription API for easy consumption, deep integration with Zalando’s infrastructure, a SQL-over-streams engine, and much more.

Running Fast SQL on DynamoDB Tables

Rockset

Have you ever wanted to run SQL queries on Amazon DynamoDB tables without impacting your production workloads? Wouldn't it be great to do so without needing to set up an ETL job and then having to manually monitor that job? In this blog, I will discuss how Rockset integrates with DynamoDB and continuously updates a collection automatically as new objects are added to a DynamoDB table.

A Day in the Life of a Frontend Engineer at Zalando

Zalando Engineering

You’ve probably never had the same day twice at your current job. At Zalando it’s no different. Here, it not only depends on the product you're currently working on but also on your peers. Actually, what's expected from a frontend engineer can vary according to a company philosophy or your own previous experience: usually a frontend engineer can be seen as a Swiss army knife when in reality at Zalando, for example, we see them as masters of trades.

How Painful is it (Really) to Switch Cloud Providers?

Teradata

Ron Luebke discusses the pains of switching cloud providers.

Rockset adds Excel spreadsheet support: Use SQL across XLSX files and join with other JSON, CSV or Parquet data

Rockset

An incredible amount of business data is floating around in Excel spreadsheets - so data scientists often need to analyze data across multiple worksheets or even multiple spreadsheets using SQL. Additionally, this data may need to be joined with other data sets that are in JSON, CSV or Parquet formats. Microsoft Excel currently has some basic SQL support in place: Use SQL for connecting to an external database like Access or SQL Server, parsing field or table contents and importing the data.

How to Do Data Science Using SQL on Raw JSON

Rockset

This post outlines how to use SQL to query and join raw data sets like nested JSON and CSV, enabling fast, interactive data science. Data scientists and analysts deal with complex data. Much of what they analyze could be third-party data, over which there is little control. To make use of this data, significant effort is spent on data engineering.

Enterprise Opportunities to Apply Reinforcement Learning & AI

Teradata

Reinforcement learning is the machine learning approach behind some of the most talked-about advances in AI.
