Some departments used IBM Db2, while others relied on VSAM files or IMS databases, creating complex data governance processes and costly data pipeline maintenance. They realized they needed a more automated, streamlined way to access the data.
Summary How much time do you spend maintaining your data pipeline? This was a fascinating conversation with someone who has spent his entire career working on simplifying complex data problems. Managing and auditing access to your servers and databases is a problem that grows in difficulty alongside the growth of your teams.
The blog emphasizes the importance of starting with a clear client focus to avoid over-engineering and ensure user-centric development. The article details how these building blocks are used to implement the JSON type, which provides support for dynamically changing data, high-performance storage, scalability, and tuning options.
To tackle these challenges, we’re thrilled to announce CDP Data Engineering (DE), the only cloud-native service purpose-built for enterprise data engineering teams. Native Apache Airflow and robust APIs for orchestrating and automating job scheduling and delivering complex data pipelines anywhere.
Bronze layers can also be the raw database tables, and they should be immutable. Next, data is processed in the Silver layer, which undergoes “just enough” cleaning and transformation to provide a unified, enterprise-wide view of core business entities. However, this architecture is not without its challenges.
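The Bronze-to-Silver step described above can be sketched in a few lines. This is a minimal, illustrative example using plain Python and toy records (the field names and cleaning rules are assumptions, not from the original article): Bronze rows land as-is and stay immutable, while the Silver view trims, normalizes, validates, and de-duplicates per entity.

```python
from datetime import datetime

# Hypothetical raw "Bronze" records, landed as-is from a source system.
bronze = [
    {"customer_id": " 42 ", "email": "A@EXAMPLE.COM", "signup": "2023-01-05"},
    {"customer_id": "42",   "email": "a@example.com", "signup": "2023-01-05"},  # duplicate
    {"customer_id": "77",   "email": "b@example.com", "signup": "not-a-date"},  # malformed
]

def to_silver(rows):
    """'Just enough' cleaning: trim, normalize, validate, de-duplicate."""
    seen, silver = set(), []
    for r in rows:
        cid = r["customer_id"].strip()
        email = r["email"].strip().lower()
        try:
            signup = datetime.strptime(r["signup"], "%Y-%m-%d").date()
        except ValueError:
            continue  # skip (or quarantine) rows that fail validation
        if cid in seen:
            continue  # Bronze keeps every row; Silver keeps one row per entity
        seen.add(cid)
        silver.append({"customer_id": cid, "email": email,
                       "signup": signup.isoformat()})
    return silver

print(to_silver(bronze))  # one clean row for customer 42
```

Note that the Bronze list is never mutated, which mirrors the immutability guidance: reprocessing the Silver layer is always possible from the untouched raw data.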
TL;DR After setting up and organizing the teams, we describe four topics to make data mesh a reality. With this 3rd platform generation, you have more real-time data analytics and a cost reduction, because it is easier to manage this infrastructure in the cloud thanks to managed services.
Of course, this is not to imply that companies will become only software (there are still plenty of people in even the most software-centric companies), just that the full scope of the business is captured in an integrated software defined process. Here, the bank loan business division has essentially become software.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. That’s where our friends at Ascend.io
One paper suggests that there is a need for a re-orientation of the healthcare industry to be more "patient-centric". Furthermore, clean and accessible data, along with data driven automations, can assist medical professionals in taking this patient-centric approach by freeing them from some time-consuming processes.
At the same time Maxime Beauchemin wrote a post about Entity-Centric data modeling. In recent years dbt simplified and revolutionised the tooling to create data models. This week I discovered SQLMesh, an all-in-one data pipelines tool. I hope he will fill the gaps. dbt, as of today, is the leading framework.
A star-studded baseball team is analogous to an optimized “end-to-end data pipeline” — both require strategy, precision, and skill to achieve success. Just as every play and position in baseball is key to a win, each component of a data pipeline is integral to effective data management.
Data Engineering is typically a software engineering role that focuses deeply on data – namely, data workflows, data pipelines, and the ETL (Extract, Transform, Load) process. What is the role of a Data Engineer? Data Engineers are skilled professionals who lay the foundation of databases and architecture.
In the modern world of data engineering, two concepts often find themselves in a semantic tug-of-war: data pipeline and ETL. Fast forward to the present day, and we now have data pipelines. Data ingestion is the first step of both ETL and data pipelines.
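The Extract, Transform, Load process mentioned above can be illustrated with a toy sketch. Everything here is illustrative — the in-memory "source" and "warehouse" stand in for real systems, and the function names are my own, not from any of the quoted articles:

```python
# A minimal ETL sketch with toy in-memory "systems" (all names are illustrative).
source = ["1,alice", "2,bob", "3,"]   # extract from a CSV-like source
warehouse = {}                        # load target

def extract(lines):
    return [line.split(",") for line in lines]

def transform(rows):
    # Drop incomplete rows and normalize names to title case.
    return [(int(uid), name.title()) for uid, name in rows if name]

def load(rows, target):
    for uid, name in rows:
        target[uid] = name

load(transform(extract(source)), warehouse)
print(warehouse)  # {1: 'Alice', 2: 'Bob'}
```

The broader "data pipeline" concept generalizes this shape: the same ingest step can feed many downstream consumers, not only a transform-then-warehouse path.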
A data scientist is only as good as the data they have access to. Most companies store their data in a variety of formats across databases and text files. This is where data engineers come in: they build pipelines that transform that data into formats that data scientists can use.
Take Astro (the fully managed Airflow solution) for a test drive today and unlock a suite of features designed to simplify, optimize, and scale your data pipelines. Try For Free → Conference Alert: Data Engineering for AI/ML This is a virtual conference at the intersection of Data and AI.
In terms of representation, data can be broadly classified into two types: structured and unstructured. Structured data can be defined as data that can be stored in relational databases, and unstructured data as everything else. There are many implications of large unstructured data for engineering.
Here is the agenda: 1) Data Application Lifecycle Management - Harish Kumar (PayPal) Hear from the team at PayPal on how they build the data product lifecycle management (DPLM) systems. 3) DataOps at AstraZeneca The AstraZeneca team talks about the data ops best practices established internally, and what worked and what didn’t work!
Your IT organization may have a permanent data lake, but data analytics teams need the ability to rapidly create insight from data. The DataKitchen Platform serves as a process hub that builds temporary analytic databases for daily and weekly ad hoc analytics work. Figure 3: Example process hub for biologic launch.
The Netflix video processing pipeline went live with the launch of our streaming service in 2007. By integrating with studio content systems, we enabled the pipeline to leverage rich metadata from the creative side and create more engaging member experiences like interactive storytelling.
Retrieval augmented generation (RAG) is an architecture framework introduced by Meta in 2020 that connects your large language model (LLM) to a curated, dynamic database. Data retrieval: Based on the query, the RAG system searches the database to find relevant data. A RAG flow in Databricks can be visualized like this.
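The retrieval step of the RAG flow described above can be sketched without any real LLM or vector store. This is a deliberately naive illustration — the corpus, the keyword-overlap scoring, and the prompt template are all stand-ins for the embedding search and model call a real system (e.g., on Databricks) would use:

```python
# A toy sketch of the RAG retrieval step (scoring is naive keyword overlap,
# standing in for a real embedding/vector-database lookup).
docs = {
    "refunds":  "Refunds are processed within 5 business days.",
    "shipping": "Standard shipping takes 3-7 days worldwide.",
}

def retrieve(query, corpus, k=1):
    """Rank documents by word overlap with the query and return the top k."""
    q = set(query.lower().split())
    scored = sorted(corpus.items(),
                    key=lambda kv: len(q & set(kv[1].lower().split())),
                    reverse=True)
    return [text for _, text in scored[:k]]

def build_prompt(query, corpus):
    context = retrieve(query, corpus)[0]
    # In a real RAG system this prompt would be sent to the LLM.
    return f"Context: {context}\nQuestion: {query}"

print(build_prompt("how long do refunds take", docs))
```

The essential idea survives the simplification: the model answers from retrieved, curated context rather than from its parameters alone, which is what lets the knowledge base stay dynamic.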
Treating data as a product is more than a concept; it’s a paradigm shift that can significantly elevate the value that business intelligence and data-centric decision-making have on the business. Data pipelines, data integrity, data lineage, data stewardship, data catalog, data product costing — let’s review each one in detail.
Factors Data Engineer Machine Learning Definition Data engineers create, maintain, and optimize data infrastructure. In addition, they are responsible for developing pipelines that turn raw data into formats that data consumers can use easily. Assess the needs and goals of the business.
Data Engineering Weekly Is Brought to You by RudderStack RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles, so you can quickly ship actionable, enriched data to every downstream team. The author made an excellent attempt to describe what vector databases are to a 5-year-old.
It aims to explain how we transformed our development practices with a data-centric approach and offers recommendations to help your teams address similar challenges in your software development lifecycle. This approach ensured comprehensive data extraction while handling various edge cases and log formats.
Engineers work with Data Scientists to help make the most of the data they collect and have deep knowledge of distributed systems and computer science. In large organizations, data engineers concentrate on analytical databases, operate data warehouses that span multiple databases, and are responsible for developing table schemas.
An Azure Data Engineer is a professional responsible for designing, implementing, and managing data solutions using Microsoft's Azure cloud platform. They work with various Azure services and tools to build scalable, efficient, and reliable data pipelines, data storage solutions, and data processing systems.
For example, to support Ripple's product capabilities, Ripple's Payments team ingests millions of transactional records into databases and performs analytics to generate invoices, reports, and other related payment operations. A lack of a centralized system makes building a single source of high-quality data difficult.
This provided a nice overview of the breadth of topics that are relevant to data engineering including data warehouses/lakes, pipelines, metadata, security, compliance, quality, and working with other teams. Open question: how to seed data in a staging environment? Test system with A/A test. Be adaptable.
Learn more in our detailed guide to data lineage visualization (coming soon) Integration with Multiple Data Sources Data lineage tools are designed to integrate with a wide range of data sources, including databases, data warehouses, and cloud-based data platforms.
Datos IO has extended its on-premise and public cloud data protection to RDBMS and Hadoop distributions with the latest version, v2.0, of its RecoverX distributed database backup product. Cloudera is more inclined toward becoming a product-centric business, with 23% of its revenue coming from services last year, compared to 31% for Hortonworks.
The demand for data-related professions, including data engineering, has indeed been on the rise due to the increasing importance of data-driven decision-making in various industries. Becoming an Azure Data Engineer in this data-centric landscape is a promising career choice.
But perhaps one of the most common reasons for data quality challenges is software feature updates and other changes made upstream by software engineers. These are particularly frustrating, because while they are breaking data pipelines constantly, it’s not their fault. He suggested: “Private vs. public methods.
Data engineers can find one for almost any need, from data extraction to complex transformations, ensuring that they’re not reinventing the wheel by writing code that’s already been written. Exceptional at data retrieval and manipulation within RDBMS. It's specialized for database querying.
It offers a wide range of services, including computing, storage, databases, machine learning, and analytics, making it a versatile choice for businesses looking to harness the power of the cloud. This cloud-centric approach ensures scalability, flexibility, and cost-efficiency for your data workloads.
As a result, a less senior team member was made responsible for modifying a production pipeline. Brooks’ law (for data): “Adding data engineer personpower to a late data project makes it later.” Shouldn’t Marcus consider upgrading his technology? A better ETL tool? Pick some other hot tool?
Owing to the vitality of application software, businesses are actively seeking professionals with excellent technical expertise and a consumer-centric mindset to develop more practical application software systems that enhance customer experience. Database A database management system (DBMS) is an automated data-keeping system.
Real-time data ingestion consumes data from various sources, performs cleaning, validation, and preprocessing operations, and finally stores the data in the required format, either structured or unstructured. As real-time insights gain popularity, real-time data ingestion remains vital for companies worldwide.
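The ingest-clean-validate-store sequence just described can be sketched as a small function. The event schema, field names, and in-memory sink below are illustrative assumptions; a production system would read from a stream and write to a durable store:

```python
import json

# Hypothetical sketch of one ingestion step: parse, validate, clean, then "store".
def ingest(raw_events, store):
    for raw in raw_events:
        try:
            event = json.loads(raw)                       # parse the payload
        except json.JSONDecodeError:
            continue                                      # drop malformed payloads
        if "user_id" not in event or "ts" not in event:   # validate required fields
            continue
        event["user_id"] = str(event["user_id"]).strip()  # clean / normalize
        store.append(event)                               # persist (toy in-memory sink)
    return store

sink = []
ingest(['{"user_id": " u1 ", "ts": 1}', 'not json', '{"ts": 2}'], sink)
print(sink)  # [{'user_id': 'u1', 'ts': 1}]
```

In a real deployment the dropped records would typically go to a dead-letter queue rather than being silently discarded, so that validation failures remain auditable.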
They need to know everything about the data and apply various mathematical and statistical tools to identify the most significant features using feature selection, feature engineering , feature transformation, etc.
Gen AI can whip up serviceable code in moments — making it much faster to build and test data pipelines. Today’s LLMs can already process enormous amounts of unstructured data, automating much of the monotonous work of data science. But what does that mean for the roles of data engineers and data scientists going forward?
Whether you're a seasoned data scientist or just stepping into the world of data, come with me as we unravel the secrets of data extraction and learn how it empowers us to unleash the full potential of data. What is data extraction? Patterns, trends, relationships, and knowledge discovered from the data.
Its flexibility allows it to operate on single-node machines and large clusters, serving as a multi-language platform for executing data engineering , data science , and machine learning tasks. Before diving into the world of Spark, we suggest you get acquainted with data engineering in general.
Slow Response to New Information: Legacy data systems often lack the computation power necessary to run efficiently and can be cost-inefficient to scale. This typically results in long-running ETL pipelines that cause decisions to be made on stale or old data.
ADF connects to various data sources, including on-premises systems, cloud services, and SaaS applications. It then gathers and relocates information to a centralized hub in the cloud using the Copy Activity within data pipelines. Transform and Enhance the Data: Once centralized, data undergoes transformation and enrichment.