Managing and utilizing data effectively is crucial for organizational success in today's fast-paced technological landscape. The vast amounts of data generated daily require advanced tools for efficient management and analysis. A path forward: Agentic AI represents a shift in thinking in enterprise data management.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Dagster offers a new approach to building and running data platforms and data pipelines. Can you describe the operational/architectural aspects of building a full data engine on top of the FDAP stack?
Challenges around data literacy, readiness, and risk exposure need to be addressed – otherwise they can hinder MDM's success. Businesses that excel with MDM and data integrity can trust their data to inform high-velocity decisions and remain compliant with emerging regulations. Today, you have more data than ever.
Data teams are expected to juggle a combination of ad-hoc requests, big-bet projects, migrations, and more, all while keeping up with the latest changes in technology.
In this engaging and witty talk, industry expert Conrado Morlan will explore how artificial intelligence can transform the daily tasks of product managers into streamlined, efficient processes. The Future of Product Management 🔮: how to continuously integrate AI into your work to stay ahead of emerging trends and technologies.
Azure SQL Database Limitations Azure SQL Database is a powerful and flexible cloud-based managed database service, but like any technology, it has its limitations. You can refer to the Azure SQL Database documentation for more information on pricing.
Disclaimer: Throughout this post, I discuss a variety of complex technologies but avoid trying to explain how they work. The goal of this post is to understand how data integrity best practices have been embraced time and time again, no matter the underlying technology. Then came Big Data and Hadoop!
This blog explores how new technologies such as the Databricks Data Intelligence Platform can pave the way for more effective and efficient multi-omics data management.
Explore the advanced features of this powerful cloud-based solution and take your data management to the next level with this comprehensive guide. To gain a competitive edge in today's fast-growing big data industry, it's crucial to have hands-on experience with this cutting-edge technology.
In this episode DeVaris Brown discusses the types of applications that are possible when teams don't have to manage the complex infrastructure necessary to support continuous data flows. Can you describe what Meroxa is and the story behind it? How have the focus and goals of the platform and company evolved over the past 2 years?
Openness: Most technologies have some degree of lock-in, but nothing has more lock-in than traditional OLTP databases. As a result, there has been very little innovation in this space for decades. At its core, a lakebase is grounded in battle-tested, open source technologies.
Integrate data governance and data quality practices to create a seamless user experience and build trust in your data. When planning your data governance approach, start small, iterate purposefully, and foster data literacy to drive meaningful business outcomes.
In this episode Dain Sundstrom, CTO of Starburst, explains how the combination of the Trino query engine and the Iceberg table format offer the ease of use and execution speed of data warehouses with the infinite storage and scalability of data lakes. What do you have planned for the future of Trino/Starburst?
Data quality and data governance are the top data integrity challenges and priorities, but they require a strong data foundation to be effective. A long-term approach to your data strategy is key to success as business environments and technologies continue to evolve. Take a proactive approach.
Summary Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Can you describe your experiences with Kafka?
Today, Snowflake advances our vision to be the ultimate platform for data-driven innovation with our announcement that we have agreed to acquire Crunchy Data, a leading provider of trusted, open source PostgreSQL technology. Crunchy Data is also a proven innovator when it comes to creating a great experience for developers.
Key Takeaways: Data mesh is a decentralized approach to data management, designed to shift creation and ownership of data products to domain-specific teams. Data fabric is a unified approach to data management, creating a consistent way to manage, access, and share data across distributed environments.
Summary Data systems are inherently complex and often require integration of multiple technologies (e.g., container orchestration, generalized workflow orchestration). This offers a single location for managing visibility and error handling so that data platform engineers can manage complexity.
RudderStack helps you build a customer data platform on your warehouse or data lake. Can you describe what SQLMesh is and the story behind it? DataOps is a term that has been co-opted and overloaded.
AI News 🤖 Mira Murati answers the Wall Street Journal about OpenAI Sora — OpenAI's CTO was asked a few questions about the underlying technology in Sora, and she revealed a few insights. Pandera, a data validation library for dataframes, now supports Polars.
Together, we discussed how Hudi drives innovation, the state of open standards, and what lies ahead for data lakehouses in 2025 and beyond. This foundational concept addresses a key challenge for enterprises: building scalable, high-performing data platforms that can support the complexity of modern data ecosystems.
Summary Generative AI has rapidly transformed everything in the technology sector. Dagster offers a new approach to building and running data platforms and data pipelines.
Data lakes are notoriously complex, yet data lakes in various forms have been gaining significant popularity as a unified interface to an organization's analytics.
Quotes It's extremely important because many Gen AI and LLM applications take an unstructured data approach, meaning many of the tools require you to give them full access to your data in an unrestricted way and let them crawl and parse it completely. Data governance is the only way to ensure those requirements are met.
Internally, banks are using AI to reduce the burden of data management, including data lineage and data quality controls, or to drive efficiencies with business intelligence, particularly in call centers. Those requirements can be fulfilled by leveraging cloud infrastructure and services.
In this episode Kevin Liu shares some of the interesting features that they have built by combining those technologies, as well as the challenges that they face in supporting the myriad workloads that are thrown at this layer of their data platform. Can you describe what role Trino and Iceberg play in Stripe's data architecture?
However, I also expect a new, reverse trend to take shape: IT teams and data scientists will start to glean even greater business acumen to plug into the broader needs of the enterprise. With so much data being fed into AI model services, security and governance will also come to the fore.
Nicola Askham found her way into data governance by accident, and stayed because of the benefit that she was able to provide by serving as a bridge between the technology and business. In this episode she shares the practical steps to implementing a data governance practice in your organization, and the pitfalls to avoid.
In this episode David Yaffe and Johnny Graettinger share the story behind the business and technology and how you can start using it today to build a real-time data lake without all of the headache. Stream processing technologies have been around for about a decade. Can you describe what Estuary is and the story behind it?
Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team.
In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she has learned about how teams should approach the process of tool selection.
Summary A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. What is the overlap between knowledge graphs and "linked data products"?
In this episode Pete Hunt, CEO of Dagster Labs, outlines these new capabilities, how they reduce the burden on data teams, and the increased collaboration that they enable across teams and business units. Can you describe what the focus of Dagster+ is and the story behind it? What problems are you trying to solve with Dagster+?
For successful personalization, you need to unify your communication technology. This involves integrating customer data across various channels – like your CRM systems, data warehouses, and more – so that the most relevant and up-to-date information is used consistently in your customer interactions. Focus on high-quality data.
In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.
He highlights the role of data teams in modern organizations and how Synq is empowering them to achieve this. Can you describe what Synq is and the story behind it?
This blog will explore the significant advancements, challenges, and opportunities impacting data engineering in 2025, highlighting the increasing importance for companies of staying up to date. Key Trends in Data Engineering for 2025: In the fast-paced world of technology, data engineering services keep data-focused companies running.
To address these challenges, we made substantial investments in advanced data understanding technologies, as part of our Privacy Aware Infrastructure (PAI). Specifically, we have adopted a “shift-left” approach, integrating data schematization and annotations early in the product development process.
Different roles and tasks in the business need their own ways to access and analyze the data in the organization. In order to enable this use case, while maintaining a single point of access, the semantic layer has evolved as a technological solution to the problem. What do you have planned for the future of Cube?
This episode is supported by Code Comments, an original podcast from Red Hat. Data observability has been gaining adoption for a number of years now, with a large focus on data warehouses.