Sat. Jul 10, 2021 - Fri. Jul 16, 2021

Tyrannical Data and Its Antidotes in the Microservices World

Confluent

Data is the lifeblood of so much of what we build as software professionals, so it’s unsurprising that operations involving its transfer occupy the vast majority of developer time across […].

Delivering Modern Enterprise Data Engineering with Cloudera Data Engineering on Azure

Cloudera

After the launch of CDP Data Engineering (CDE) on AWS a few months ago, we are thrilled to announce that CDE, the only cloud-native service purpose-built for enterprise data engineers, is now available on Microsoft Azure. CDP Data Engineering offers an all-inclusive toolset that enables data pipeline orchestration, automation, advanced monitoring, visual profiling, and a comprehensive management toolset for streamlining ETL processes and making complex data actionable across your analytic team

Customer Support Automation Platform at Uber

Uber Engineering

If you’ve used any online/digital service, chances are that you are familiar with what a typical customer service experience entails: you send a message (usually email aliased) to the company’s support staff, fill … The post Customer Support Automation Platform at Uber appeared first on Uber Engineering Blog.

Low Code And High Quality Data Engineering For The Whole Organization With Prophecy

Data Engineering Podcast

Summary: There is a wealth of tools and systems available for processing data, but the user experience of integrating them and building workflows is still lacking. This is particularly important in large and complex organizations where domain knowledge and context are paramount and there may not be access to engineers for codifying that expertise. Raj Bains founded Prophecy to address this need by creating a UI-first platform for building and executing data engineering workflows that orchestrates

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Create a Data Analysis Pipeline with Apache Kafka and RStudio

Confluent

In Data Science projects, we distinguish between descriptive analytics and statistical models running in production. Overall, these can be seen as one process. You start with analyzing historical data to […].

Accelerate Offloading to Cloudera Data Warehouse (CDW) with Procedural SQL Support

Cloudera

Did you know Cloudera customers, such as SMG and Geisinger, offloaded their legacy DW environment to Cloudera Data Warehouse (CDW) to take advantage of CDW’s modern architecture and best-in-class performance? In addition to substantial cost savings upon moving to CDW, Geisinger is also able to search through hundreds of millions of patient note records in seconds, providing better treatment to their patients.

More Trending

Exploring The Design And Benefits Of The Modern Data Stack

Data Engineering Podcast

Summary: We have been building platforms and workflows to store, process, and analyze data since the earliest days of computing. Over that time there have been countless architectures, patterns, and "best practices" to make that task manageable. With the growing popularity of cloud services, a new pattern has emerged and been dubbed the "Modern Data Stack". In this episode members of the GoDataDriven team, Guillermo Sanchez, Bram Ochsendorf, and Juan Perafan, explain the combination

Data Engineers of Netflix - Interview with Kevin Wylie

Netflix Tech

This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Kevin Wylie is a Data Engineer on the Content Data Science and Engineering team. In this post, Kevin talks about his extensive experience in content analytics at Netflix since joining more than 10 years ago.

DIA Entries 2021: Judges’ Insight

Cloudera

The 2021 Data Impact Award (DIA) submissions are starting to stream in, and we know many of you are contemplating your entries – which we are excited to see. To help guide your award strategy, we thought it would be an excellent opportunity to ask our judges — a panel comprised of leading analysts and journalists well-versed in the application of data and the wider benefits it can bring across industries – what it takes for a winning project.

How to build a successful cloud data architecture

DataKitchen

The post How to build a successful cloud data architecture first appeared on DataKitchen.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

The Post-Pandemic Supply Chain: Time to Go Back to Basics?

Teradata

Learn how complexities baked into the data analytics ecosystems of supply chains can be simplified to eliminate redundancy, increase time to value, and reduce cost.

Real-Time Analytics with dbt + Rockset

Rockset

Rockset was founded to make it easy for developers and data teams to go from real-time data to actionable insights. We designed Rockset to remove many of the barriers teams face while building with real-time data including data preparation, performance tuning and infrastructure management. We also built ground up to support full SQL (including joins and aggregations), the most common query language for analytics.

A Reference Architecture for the Cloudera Private Cloud Base Data Platform

Cloudera

Introduction and Rationale. The release of Cloudera Data Platform (CDP) Private Cloud Base edition provides customers with a next generation hybrid cloud architecture. This blog post provides an overview of best practice for the design and deployment of clusters incorporating hardware and operating system configuration, along with guidance for networking and security as well as integration with existing enterprise infrastructure.

Keys to DataOps Transformation

DataKitchen

The post Keys to DataOps Transformation first appeared on DataKitchen.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

How to Get More ROI—Faster—From Machine Learning

Teradata

Find out how to harness machine learning and AI to contain costs, increase revenue, and grow your organization’s customer base.

Identifying Document Types at Scribd

Scribd Technology

User-uploaded documents have been a core component of Scribd’s business from the very beginning; understanding what is actually in the document corpus unlocks exciting new opportunities for discovery and recommendation. With Scribd anybody can upload and share documents, analogous to YouTube and videos. Over the years, our document corpus has become larger and more diverse, which has made understanding it an ever-increasing challenge.

Paving the way for women in Tech: Fostering young girls’ enthusiasm for STEM

Cloudera

In the late 90s, when I was pursuing my studies in engineering, only a few girls enrolled in any STEM-related courses. While it was our love for math & science and the prospect of future opportunities that brought us here, we sadly found many of them gave up halfway through the course, and those who graduated either quit or never entered the profession.

Top 20 Logistic Regression Interview Questions and Answers

ProjectPro

To become a successful data scientist in the industry, understanding the end-to-end workflow of the data science pipeline (understanding data, data pre-processing, model building, model evaluation, and model deployment) is essential. Assuming you do not want to overwhelm yourself with fancy machine learning algorithms, mastering the concepts of logistic regression should be your primary step to get familiar with the end-to-end data science pipeline.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

The Weekly ETL: Will Data Engineering Ever Be Sexy like Data Science?

Monte Carlo

In Monte Carlo’s Weekly ETL (Explanations Through Lior) series, Lior Gavish, Monte Carlo’s co-founder and CTO, answers a trending question on Reddit about some of data engineering’s hottest topics. Reddit user /SWE-Aaron asks if data engineering will ever get the same attention as data science and whether that would actually be a good thing.

Why It’s Hard for Engineering to Support Marketing

RudderStack

Marketing teams get a bad rap from engineering, oftentimes for understandable reasons.

Optimizing Risk and Exposure Management – Roundtable Highlights

Cloudera

We recently hosted a roundtable focused on optimizing risk and exposure management with data insights. For financial institutions and insurers, risk and exposure management has always been a fundamental tenet of the business. Now, risk management has become exponentially complicated in multiple dimensions. In this session we explored what firms are doing to approach the uncertainty with more predictability.

Top 15 Cloud Computing Projects Ideas for Beginners in 2023

ProjectPro

People searching for cloud computing jobs per million grew by approximately 50%. According to an Indeed Jobs report, the share of cloud computing jobs has increased by 42% per million from 2018 to 2021. The global cloud computing market is poised to grow by $287.03 billion during 2021-2025. Also, global spending on public cloud services will double by 2023.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Apache Superset 1.2: Release Notes

Preset

We're excited to announce the release of Apache Superset 1.2! In this release post, we will focus on the biggest and most interesting tangible, end-user features.

Announcing Monte Carlo’s Incident IQ, a Root Cause Analysis Workflow for Data Teams

Monte Carlo

Incident IQ gives data engineers and analysts a centralized, all-in-one solution for conducting incident management and root cause analysis on your data pipelines. Today, we are excited to announce the release of Monte Carlo’s data incident management feature, Incident IQ, a new solution that allows data teams to collaboratively identify, alert on, and remediate the root cause of critical data issues before they impact downstream systems and end users.

Courage and Curiosity: Valuable Attributes for Women in Big Data

Cloudera

Last week we held our third Women In Data Webinar, and what a session it was! We were honored to welcome Justyna Lebedyk, Senior Product Owner Big Data, Commerzbank AG, who posed the question “Does diversity win?” I had the pleasure of chatting with Justyna about the key themes from her talk and what advice she would give to others looking to pursue a career in data.

20 Linear Regression Interview Questions and Answers 2023

ProjectPro

Linear Regression is probably one of the most well-known machine learning algorithms. It essentially involves modeling the relation between the given or derived parameters and the target to be learned. Therefore, any machine learning job interview would be incomplete without a peppering of Linear Regression questions. These linear regression interview questions and answers will help you prepare for your machine learning interview.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Inclusive Leadership Minimises Negative Impact of Workplace Politics

Cloudera

Can an organization eradicate workplace politics completely? Defined by the Harvard Business Review as “a variety of activities associated with the use of influence tactics to improve personal or organizational interests”, politics at the workplace is inevitable. Undeniably, wielding influence to achieve positive outcomes is encouraged. However, the question leaders should be asking is, are fragmented individual agendas taking precedence over an organization’s mission?

Monte Carlo Launches Data Incident Management Feature, Incident IQ, to Help Organizations Achieve Data Trust

Monte Carlo

Monte Carlo, the data reliability company, today released data incident management feature, Incident IQ, a new suite of capabilities that help data engineers better pinpoint, address, and resolve data downtime at scale through the Monte Carlo Data Observability Platform. Incident IQ automatically generates rich insights about critical data issues through root cause analysis, giving teams unprecedented visibility into the end-to-end health and trust of their data beyond the scope of traditional

How to Become an Artificial Intelligence Engineer in 2023

ProjectPro

The demand for data-related roles has increased massively in the past few years. Companies are actively seeking talent in these areas, and there is a huge market for individuals who can manipulate data, work with large databases and build machine learning algorithms. While data science is the most hyped-up career path in the data industry, it certainly isn't the only one.

15 Time Series Projects Ideas for Beginners to Practice 2023

ProjectPro

Time series analysis and forecasting is a dark horse in the domain of Data Science. Time series is among the most applied Data Science techniques in various industrial and business operations, such as financial analysis, production planning, supply chain management, and many more. Machine learning for time series is often a neglected topic. More recent techniques, such as natural language processing, pattern recognition, and others usually gain better attention.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.