To tackle these challenges, we’re thrilled to announce CDP Data Engineering (DE), the only cloud-native service purpose-built for enterprise data engineering teams. It ships with native Apache Airflow and robust APIs for orchestrating and automating job scheduling and delivering complex data pipelines anywhere.
With Astro, you can build, run, and observe your data pipelines in one place, ensuring your mission-critical data is delivered on time. This blog captures the current state of agent adoption, emerging software engineering roles, and the main use-case categories.
The blog emphasizes the importance of starting with a clear client focus to avoid over-engineering and ensure user-centric development. Gunnar Morling: Revisiting the Outbox Pattern. The blog is an excellent summary of the path traveled with the outbox pattern and the challenges ahead.
After the launch of CDP Data Engineering (CDE) on AWS a few months ago, we are thrilled to announce that CDE, the only cloud-native service purpose-built for enterprise data engineers, is now available on Microsoft Azure. CDP data lifecycle integration and SDX security and governance. Key features of CDP Data Engineering.
When data reaches the Gold layer, it is highly curated and structured, offering a single version of the truth for decision-makers across the organization. We have also seen a fourth layer, the Platinum layer, in companies’ proposals that extend the data pipeline to OneLake and Microsoft Fabric.
Snowflake is completely managed, but its main focus is on the data warehouse layer, and users need to integrate with other tools for BI, ML, or ETL. Ideal for: business-centric workflows involving Fabric. Snowflake: environments with a lot of developers and data engineers.
However, that's also something we're re-thinking with our warehouse-centric strategy. How does reverse ETL factor into the enrichment process for profile data? Contact Info: Kevin (LinkedIn, Blog), Hanhan (LinkedIn). Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?
One paper suggests that there is a need for a re-orientation of the healthcare industry to be more "patient-centric". Furthermore, clean and accessible data, along with data-driven automations, can assist medical professionals in taking this patient-centric approach by freeing them from some time-consuming processes.
NVIDIA released Eagle, a vision-centric multimodal LLM. Look at the example in the GitHub repo: given an image and a user input, the LLM is able to answer prompts like "Describe the image in detail" or "Which car in the picture is more aerodynamic?" based on a drawing. How does UK football rely so heavily on data?
This introductory blog focuses on an overview of our journey. Future blogs will provide deeper dives into each service, sharing insights and lessons learned from this process.
To enable LGIM to better utilize its wealth of data, LGIM required a centralized platform that made internal data discovery easy for all teams and could securely integrate external partners and third-party outsourced data pipelines. The post Cloudera Customer Story appeared first on Cloudera Blog.
To this end, UBL embarked on a data analytics project that would achieve its goals for an improved data environment. Next, it needed to enhance the company’s customer-centric approach for a needs-based alignment of products and services. Mr. Kashif Riaz, head of data and AI at UBL, shared his thoughts on this project.
Take Astro (the fully managed Airflow solution) for a test drive today and unlock a suite of features designed to simplify, optimize, and scale your data pipelines. Try For Free → Conference Alert: Data Engineering for AI/ML. This is a virtual conference at the intersection of Data and AI.
Without DataOps, companies can employ hundreds of data professionals and still struggle. The data pipelines must contend with a high level of complexity: over seventy data sources and a variety of cadences, including daily/weekly updates and builds. That’s the power of DataOps automation. It’s that simple.
Here is the agenda: 1) Data Application Lifecycle Management - Harish Kumar (PayPal): Hear from the team at PayPal on how they build the data product lifecycle management (DPLM) systems. 3) DataOps at AstraZeneca: The AstraZeneca team talks about the DataOps best practices they established internally, and what worked and what didn’t work!
What is the role of a Data Engineer? Data Engineering is typically a software engineering role that focuses deeply on data: namely, data workflows, data pipelines, and the ETL (Extract, Transform, Load) process. Data scientists and data analysts depend on data engineers to build these data pipelines.
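The extract-transform-load flow named above can be sketched in a few lines of Python. The source records, field names, and in-memory "warehouse" here are hypothetical, purely for illustration; a real pipeline would read from an API or database and write to a warehouse table.

```python
# Minimal ETL sketch. All record fields (name, amount) are made-up examples.

def extract():
    # Extract: in practice this would pull from an API, database, or file.
    return [{"name": " Alice ", "amount": "10.50"}, {"name": "bob", "amount": "3"}]

def transform(records):
    # Transform: trim whitespace, normalize casing, parse amounts to floats.
    return [
        {"name": r["name"].strip().title(), "amount": float(r["amount"])}
        for r in records
    ]

def load(rows, warehouse):
    # Load: here the "warehouse" is just a list; real pipelines would
    # write to a database table or object store.
    warehouse.extend(rows)
    return warehouse

warehouse = []
load(transform(extract()), warehouse)
```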
The word “data” is ubiquitous in narratives of the modern world. And data, the thing itself, is vital to the functioning of that world. This blog discusses quantifications, types, and implications of data, in sections on the quantifications of data, addressing the challenges of data, and conclusions.
Business users are unable to find and access data assets critical to their workflows. Data engineers spend countless hours troubleshooting broken pipelines. The data team is constantly burning out and has a high employee turnover. Stakeholders fail to see the ROI behind expensive data initiatives.
Treating data as a product is more than a concept; it’s a paradigm shift that can significantly elevate the value that business intelligence and data-centric decision-making have on the business. Data pipelines, data integrity, data lineage, data stewardship, data catalog, data product costing: let’s review each one in detail.
These limited-term databases can be generated as needed from automated recipes (orchestrated pipelines and qualification tests) stored and managed within the process hub. The process hub capability of the DataKitchen Platform ensures that those processes that act upon data – the tests, the recipes – are shareable and manageable.
Data Engineering Weekly Is Brought to You by RudderStack. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles, so you can quickly ship actionable, enriched data to every downstream team. The blog also shows how to use vector search in Postgres with pgvector.
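Under the hood, pgvector's cosine-distance operator (`<=>`) ranks rows by one minus cosine similarity. The following pure-Python sketch, with made-up document embeddings, shows the computation the extension performs inside Postgres; it is an illustration of the idea, not the extension's actual code.

```python
import math

def cosine_distance(a, b):
    # The quantity pgvector's <=> operator computes: 1 - cosine similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Hypothetical 2-d embeddings for three stored documents and one query.
docs = {"doc1": [1.0, 0.0], "doc2": [0.7, 0.7], "doc3": [0.0, 1.0]}
query = [0.9, 0.1]

# Rough equivalent of:
#   SELECT id FROM docs ORDER BY embedding <=> :query LIMIT 1;
nearest = min(docs, key=lambda d: cosine_distance(docs[d], query))
```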
This helps to enhance data quality, facilitate data governance, and enable regulatory compliance. Notable Data Lineage Tools and Solutions: Let’s take a look at several notable data lineage tools that can improve the quality and efficiency of your data pipeline.
Snowpark is our secure deployment and processing framework for non-SQL code, consisting of two layers: Familiar Client-Side Libraries – Snowpark brings deeply integrated, DataFrame-style programming and OSS-compatible APIs to the languages data practitioners like to use. Previously, tasks could be scheduled to run at most once per minute.
Data Engineering Weekly Is Brought to You by RudderStack. RudderStack provides data pipelines that make collecting data from every application, website, and SaaS platform easy, then activating it in your warehouse and business tools. Sign up free to test out the tool today. I print this out and read it a couple of times.
But perhaps one of the most common reasons for data quality challenges is software feature updates and other changes made upstream by software engineers. These are particularly frustrating because, while they are breaking data pipelines constantly, it’s not their fault. He suggested: “Private vs. public methods.”
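The "private vs. public methods" idea carries over from software engineering to data: downstream consumers read only a stable public view, while upstream internals stay free to change. A minimal sketch, with hypothetical internal field names:

```python
# Upstream team's internal representation; these field names may change at will.
_internal_event = {"usr_id_v2": 42, "evt_ts_epoch_ms": 1700000000000}

def public_user_event(raw):
    # The "public method": a stable contract exposed to data consumers.
    # If the upstream field names change, only this mapping is updated,
    # and downstream pipelines built on user_id / event_time_ms keep working.
    return {
        "user_id": raw["usr_id_v2"],
        "event_time_ms": raw["evt_ts_epoch_ms"],
    }

event = public_user_event(_internal_event)
```

In warehouse terms, the same pattern shows up as a curated view or model that consumers query instead of the raw source tables.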
As a result, a less senior team member was made responsible for modifying a production pipeline. When you architect for flexibility, quality, rapid deployment, and real-time data monitoring (in addition to your customer requirements), you move towards a DataOps-centric data engineering practice.
He is also an open-source developer at The Apache Software Foundation and the author of Hysterical, a popular blog on tech careers and topics like data, coding, and engineering. Through these roles, he has developed a passion for using data and common sense to generate simple, implementable solutions to complex problems.
Owing to the central role of application software, businesses are actively seeking professionals with excellent technical expertise and a consumer-centric mindset to develop more practical application software systems that enhance customer experience. This blog has explored its utility, types, and impact on businesses and organizations.
Slow Response to New Information: Legacy data systems often lack the computation power necessary to run efficiently and can be cost-inefficient to scale. This typically results in long-running ETL pipelines that cause decisions to be made on stale or old data.
If there arises a need to modify the data pipeline, that is, the data flow from the source to the staging area, the governance systems provide the capability to monitor the flow processes and the other data they hold, after which the ML models are trained.
As advanced analytics and AI continue to drive enterprise strategy, leaders are tasked with building flexible, resilient data pipelines that accelerate trusted insights. A New Level of Productivity with Remote Access: the new Cloudera Data Engineering 1.23 release brings remote access from familiar development tools (Jupyter, PyCharm, and VS Code).
The article discusses common pitfalls such as absence bias and intervention bias while advocating for a user-centric approach that emphasizes evaluating retrieval accuracy through precision and recall, with a particular focus on recall. Furthermore, the article highlights Glovo's evolution towards a declarative approach to defining data products.