Data Management and Technology - Data Engineering Digest

The Future of Data Management Is Agentic AI

Snowflake

APRIL 13, 2025

Managing and utilizing data effectively is crucial for organizational success in today's fast-paced technological landscape. The vast amounts of data generated daily require advanced tools for efficient management and analysis. A path forward: Agentic AI represents a change in thinking in enterprise data management.

Data Management

Data Management Management Consulting Unstructured Data

Find Out About The Technology Behind The Latest PFAD In Analytical Database Development

Data Engineering Podcast

FEBRUARY 25, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Dagster offers a new approach to building and running data platforms and data pipelines. Can you describe the operational/architectural aspects of building a full data engine on top of the FDAP stack?

Database

Database Technology Data Lake High Quality Data

9 Habits Of Effective Data Managers – Running A Data Team

Seattle Data Guy

JULY 2, 2024

Data teams are expected to juggle a combination of ad-hoc requests, big bet projects, migrations, etc. All while keeping up with the latest changes in technology.

Data Management

Data Management Management Data Project

Webinars

Agent Tooling: Connecting AI to Your Tools, Systems & Data

How to Modernize Manufacturing Without Losing Control

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Transforming Omics Data Management with Databricks Data Intelligence Platform

databricks

SEPTEMBER 30, 2024

This blog explores how new technologies such as Databricks Data Intelligence Platform can pave the way for more effective and efficient multi-omics data management.

Data Management

Data Management Management Data Technology

The AI Superhero Approach to Product Management

Speaker: Conrado Morlan

In this engaging and witty talk, industry expert Conrado Morlan will explore how artificial intelligence can transform the daily tasks of product managers into streamlined, efficient processes. The Future of Product Management 🔮: How to continuously integrate AI into your work to stay ahead of emerging trends and technologies.

Management

Data Integrity for AI: What’s Old is New Again

Precisely

JANUARY 9, 2025

Disclaimer: Throughout this post, I discuss a variety of complex technologies but avoid trying to explain how these technologies work. The goal of this post is to understand how data integrity best practices have been embraced time and time again, no matter the technology underpinning. Then came Big Data and Hadoop!

Data Integration

Data Integration Hadoop Data Warehouse Data Lake

Master Data Management: Common Misconceptions You Should Know

Precisely

OCTOBER 23, 2023

When most people think of master data management, they first think of customers and products. But master data encompasses so much more than data about customers and products. Challenges of Master Data Management: A decade ago, master data management (MDM) was a much simpler proposition than it is today.

Data Management

Data Management Management Data Data Integration

Top 10 Trending Courses in Information Technology 2023

Knowledge Hut

NOVEMBER 16, 2023

The best part of jumping on the information technology (IT) bandwagon is the enormous range of possibilities it opens up for anyone who studies for a diploma or a degree, or goes on to a master's degree or a research course. He or she can get a full-fledged engineering degree. You can learn CCNA, CCNP and more from the Cisco academy.

Technology

Technology MySQL MongoDB Google Cloud

Making The Total Cost Of Ownership For External Data Manageable With Crux

Data Engineering Podcast

JULY 17, 2022

In this episode Crux CTO Mark Etherington discusses the different costs involved in managing external data, how to think about the total return on investment for your data, and how the Crux platform is architected to reduce the toil involved in managing third party data. When is Crux the wrong choice?

Data Management

Data Management Management Metadata MongoDB

Big Data Technologies that Everyone Should Know in 2024

Knowledge Hut

APRIL 25, 2024

Big data in information technology is used to improve operations, provide better customer service, develop customized marketing campaigns, and take other actions to increase revenue and profits. In the world of technology, things are always changing. It is especially true in the world of big data.

Big Data

Big Data Technology Hadoop NoSQL

Realtime Data Applications Made Easier With Meroxa

Data Engineering Podcast

APRIL 23, 2023

In this episode DeVaris Brown discusses the types of applications that are possible when teams don't have to manage the complex infrastructure necessary to support continuous data flows. Can you describe what Meroxa is and the story behind it? How have the focus and goals of the platform and company evolved over the past 2 years?

Data Lake

Data Lake Kafka Machine Learning Data Warehouse

IBM Technology Chooses Cloudera as its Preferred Partner for Addressing Real Time Data Movement Using Kafka

Cloudera

SEPTEMBER 26, 2023

IBM and Cloudera’s common goal is to accelerate data-driven decision making for enterprise customers, working on defining and executing the best solution for each customer. You can now elevate your data potential and activate AI’s capabilities through the synergic integration between IBM watsonx and Cloudera.

Kafka

Kafka Technology IT Government

Modern Data Governance: Trends for 2025

Precisely

JANUARY 30, 2025

Integrate data governance and data quality practices to create a seamless user experience and build trust in your data. When planning your data governance approach, start small, iterate purposefully, and foster data literacy to drive meaningful business outcomes.

Data Governance

Data Governance Government Metadata Data

Using Trino And Iceberg As The Foundation Of Your Data Lakehouse

Data Engineering Podcast

FEBRUARY 18, 2024

In this episode Dain Sundstrom, CTO of Starburst, explains how the combination of the Trino query engine and the Iceberg table format offers the ease of use and execution speed of data warehouses with the infinite storage and scalability of data lakes. What do you have planned for the future of Trino/Starburst?

Data Lake

Data Lake High Quality Data Data Warehouse Google Cloud

Expert Insights for Your 2025 Data, Analytics, and AI Initiatives

Precisely

NOVEMBER 18, 2024

Data quality and data governance are the top data integrity challenges and priorities. A long-term approach to your data strategy is key to success as business environments and technologies continue to evolve. However, they require a strong data foundation to be effective. Take a proactive approach.

Data Analytics

Data Analytics Data Governance Data Integration Government

Troubleshooting Kafka In Production

Data Engineering Podcast

DECEMBER 24, 2023

Summary: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. Can you describe your experiences with Kafka?

Kafka

Kafka Data Lake High Quality Data SQL

Modern Data Architecture: Data Mesh and Data Fabric 101

Precisely

OCTOBER 31, 2024

Key Takeaways: Data mesh is a decentralized approach to data management, designed to shift creation and ownership of data products to domain-specific teams. Data fabric is a unified approach to data management, creating a consistent way to manage, access, and share data across distributed environments.

Data Architecture

Data Architecture Architecture Metadata Government

An Overview Of The State Of Data Orchestration In An Increasingly Complex Data Ecosystem

Data Engineering Podcast

SEPTEMBER 10, 2023

Summary: Data systems are inherently complex and often require integration of multiple technologies. This offers a single location for managing visibility and error handling so that data platform engineers can manage complexity. container orchestration, generalized workflow orchestration, etc.)

BI

BI SQL Machine Learning Data

Seamless SQL And Python Transformations For Data Engineers And Analysts With SQLMesh

Data Engineering Podcast

JUNE 25, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. RudderStack helps you build a customer data platform on your warehouse or data lake. Can you describe what SQLMesh is and the story behind it? DataOps is a term that has been co-opted and overloaded.

Data Engineering

Data Engineering Data Engineer Python Engineering

Cloudera’s QATS Certification for Dell PowerScale Unleashes a New Era of Data Management

Cloudera

NOVEMBER 28, 2023

With its rise in popularity, generative AI has emerged as a top CEO priority, and performant, seamless, and secure data management and analytics solutions are essential to power those AI applications. This means you can expect simpler data management and drastically improved productivity for your business users.

Certification

Certification Data Management Management Cloud

Practical First Steps In Data Governance For Long Term Success

Data Engineering Podcast

JUNE 2, 2024

Nicola Askham found her way into data governance by accident, and stayed because of the benefit that she was able to provide by serving as a bridge between the technology and business. In this episode she shares the practical steps to implementing a data governance practice in your organization, and the pitfalls to avoid.

Data Governance

Data Governance Government Data Lake High Quality Data

The Future of Data Lakehouses: A Fireside Chat with Vinoth Chandar - Founder CEO Onehouse & PMC Chair of Apache Hudi

Data Engineering Weekly

JANUARY 8, 2025

Together, we discussed how Hudi drives innovation, the state of open standards, and what lies ahead for data lakehouses in 2025 and beyond. This foundational concept addresses a key challenge for enterprises: building scalable, high-performing data platforms that can support the complexity of modern data ecosystems.

Data Lake

Data Lake Datasets Retail Data Ingestion

Data News — Week 24.11

Christophe Blefari

MARCH 15, 2024

AI News 🤖 Mira Murati answers the Wall Street Journal about OpenAI Sora — OpenAI CTO has been asked a few questions about the underlying technology in Sora. The technology under this, is, Cityvision. Pandera, a data validation library for dataframes, now supports Polars. She revealed a few insights.

Metadata

Metadata Data Data Warehouse Software Engineer

Top 10 Data Engineering Trends in 2025

Edureka

APRIL 22, 2025

This blog will explore the significant advancements, challenges, and opportunities impacting data engineering in 2025, highlighting the increasing importance for companies to stay updated. Key Trends in Data Engineering for 2025: In the fast-paced world of technology, data engineering services keep companies that focus on data running.

Data Engineering

Data Engineering Data Engineer Engineering Consulting

Making Email Better With AI At Shortwave

Data Engineering Podcast

APRIL 21, 2024

Summary: Generative AI has rapidly transformed everything in the technology sector. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Dagster offers a new approach to building and running data platforms and data pipelines.

Data Lake

Data Lake High Quality Data Machine Learning Data Pipeline

Stitching Together Enterprise Analytics With Microsoft Fabric

Data Engineering Podcast

JUNE 23, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. Data lakes in various forms have been gaining significant popularity as a unified interface to an organization's analytics. Closing Announcements: Thank you for listening!

Data Lake

Data Lake High Quality Data Hadoop Machine Learning

2024 Governance Trends for Data Leaders

phData: Data Engineering

NOVEMBER 1, 2024

Quotes: It's extremely important because many of the Gen AI and LLM applications take an unstructured data approach, meaning many of the tools require you to give the tools full access to your data in an unrestricted way and let it crawl and parse it completely. Data governance is the only way to ensure those requirements are met.

Government

Government Data Governance Finance Metadata

Being Data Driven At Stripe With Trino And Iceberg

Data Engineering Podcast

JUNE 16, 2024

In this episode Kevin Liu shares some of the interesting features that they have built by combining those technologies, as well as the challenges that they face in supporting the myriad workloads that are thrown at this layer of their data platform. Can you describe what role Trino and Iceberg play in Stripe's data architecture?

Data Lake

Data Lake High Quality Data Metadata Machine Learning

Trends and Takeaways from Banking and Payments’ Event of the Year

Snowflake

NOVEMBER 11, 2024

Internally, banks are using AI to reduce the burden of data management, including data lineage and data quality controls, or drive efficiencies with business intelligence particularly in call centers. Those requirements can be fulfilled by leveraging cloud infrastructure and services.

Banking

Banking Finance Retail Food

Snowflake Startup Spotlight: Innova-Q

Snowflake

APRIL 7, 2025

Our leadership combines decades of experience in product safety and quality management with cutting-edge expertise in AI, data science and regulatory insights. We are inspired by the transformative potential of technology to solve persistent challenges in product quality and compliance that we experienced firsthand.

Food

Food Data Transparency Software Engineer Software Engineering

X-Ray Vision For Your Flink Stream Processing With Datorios

Data Engineering Podcast

JUNE 9, 2024

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. This episode is supported by Code Comments, an original podcast from Red Hat. Data observability has been gaining adoption for a number of years now, with a large focus on data warehouses.

Process

Process Data Lake High Quality Data Machine Learning

Keep Your Data Lake Fresh With Real Time Streams Using Estuary

Data Engineering Podcast

MAY 21, 2023

In this episode David Yaffe and Johnny Graettinger share the story behind the business and technology and how you can start using it today to build a real-time data lake without all of the headache. Stream processing technologies have been around for around a decade. Can you describe what Estuary is and the story behind it?

Data Lake

Data Lake Machine Learning Kafka Data Warehouse

Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable

Data Engineering Podcast

OCTOBER 15, 2023

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.

Process

Process Building SQL BI

Surveying The Market Of Database Products

Data Engineering Podcast

OCTOBER 29, 2023

In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she has learned about how teams should approach the process of tool selection. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles.

Database

Database SQL BI Machine Learning

Building Linked Data Products With JSON-LD

Data Engineering Podcast

SEPTEMBER 17, 2023

Summary: A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. What is the overlap between knowledge graphs and "linked data products"?

Building

Building SQL BI Python

4 Practical Tips for Implementing Data-Driven Personalization

Precisely

NOVEMBER 11, 2024

For successful personalization, you need to unify your communication technology. This involves integrating customer data across various channels – like your CRM systems, data warehouses, and more – so that the most relevant and up-to-date information is used consistently in your customer interactions. Focus on high-quality data.

High Quality Data

High Quality Data Data Data Warehouse Technology

Ship Smarter Not Harder With Declarative And Collaborative Data Orchestration On Dagster+

Data Engineering Podcast

MARCH 24, 2024

In this episode Pete Hunt, CEO of Dagster Labs, outlines these new capabilities, how they reduce the burden on data teams, and the increased collaboration that they enable across teams and business units. Can you describe what the focus of Dagster+ is and the story behind it? What problems are you trying to solve with Dagster+?

Data Lake

Data Lake High Quality Data Hadoop Machine Learning

Tackling Real Time Streaming Data With SQL Using RisingWave

Data Engineering Podcast

FEBRUARY 4, 2024

In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex.

SQL

SQL Data Lake High Quality Data Machine Learning

Improve Data Quality Through Engineering Rigor And Business Engagement With Synq

Data Engineering Podcast

JUNE 30, 2024

He highlights the role of data teams in modern organizations and how Synq is empowering them to achieve this. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. Can you describe what Synq is and the story behind it?

Pipeline-centric

Pipeline-centric Engineering Data Lake High Quality Data

Establish A Single Source Of Truth For Your Data Consumers With A Semantic Layer

Data Engineering Podcast

APRIL 7, 2024

Different roles and tasks in the business need their own ways to access and analyze the data in the organization. In order to enable this use case, while maintaining a single point of access, the semantic layer has evolved as a technological solution to the problem. What do you have planned for the future of Cube?

Data Lake

Data Lake High Quality Data BI Data Workflow

Reflecting On The Past 6 Years Of Data Engineering

Data Engineering Podcast

FEBRUARY 5, 2023

Summary: This podcast started almost exactly six years ago, and the technology landscape was much different than it is now. In that time there have been a number of generational shifts in how data engineering is done. Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?

Data Engineering

Data Engineering Data Engineer Engineering PostgreSQL

When And How To Conduct An AI Program

Data Engineering Podcast

MARCH 3, 2024

Summary: Artificial intelligence technologies promise to revolutionize business and produce new sources of value. Colleen Tartow has worked across all stages of the data lifecycle, and in this episode she shares her hard-earned wisdom about how to conduct an AI program for your organization.

Programming

Programming Data Lake High Quality Data Machine Learning

Pushing The Limits Of Scalability And User Experience For Data Processing With Jignesh Patel

Data Engineering Podcast

JANUARY 7, 2024

Summary: Data processing technologies have dramatically improved in their sophistication and raw throughput. Unfortunately, the volumes of data that are being generated continue to double, requiring further advancements in the platform capabilities to keep up. What do you have planned for the future of your academic research?

Data Process

Data Process Process Data Lake High Quality Data
