Pushing The Limits Of Scalability And User Experience For Data Processing With Jignesh Patel

Data Engineering Podcast

Summary: Data processing technologies have dramatically improved in their sophistication and raw throughput. Unfortunately, the volumes of data that are being generated continue to double, requiring further advancements in the platform capabilities to keep up. What do you have planned for the future of your academic research?

Master Data Management: Common Misconceptions You Should Know

Precisely

When most people think of master data management, they first think of customers and products. But master data encompasses so much more than data about customers and products. Challenges of Master Data Management: A decade ago, master data management (MDM) was a much simpler proposition than it is today.

Composable data management at Meta

Engineering at Meta

In recent years, Meta’s data management systems have evolved into a composable architecture that creates interoperability, promotes reusability, and improves engineering efficiency. Data is at the core of every product and service at Meta.

Mastering Batch Data Processing with Versatile Data Kit (VDK)

Towards Data Science

Data Management: A tutorial on how to use VDK to perform batch data processing. Versatile Data Kit (VDK) is an open-source data ingestion and processing framework designed to simplify data management complexities.

Aligning Velox and Apache Arrow: Towards composable data management

Engineering at Meta

This new convergence helps Meta and the larger community build data management systems that are unified, more efficient, and composable. Meta’s Data Infrastructure teams have been rethinking how data management systems are designed.

Massively Parallel Data Processing In Python Without The Effort Using Bodo

Data Engineering Podcast

In this episode Ehsan Totoni explains how he built the Bodo project to bring the speed and processing power of HPC techniques to the Python data ecosystem without requiring any re-work. What are the techniques/technologies that teams might use to optimize or scale out their data processing workflows?

Build Your Python Data Processing Your Way And Run It Anywhere With Fugue

Data Engineering Podcast

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
