This site uses cookies to improve your experience. To help us insure we adhere to various privacy regulations, please select your country/region of residence. If you do not select a country, we will assume you are from the United States. Select your Cookie Settings or view our Privacy Policy and Terms of Use.
Cookie Settings
Cookies and similar technologies are used on this website for proper function of the website, for tracking performance analytics and for marketing purposes. We and some of our third-party providers may use cookie data for various purposes. Please review the cookie settings below and choose your preference.
Used for the proper function of the website
Used for monitoring website traffic and interactions
Cookie Settings
Cookies and similar technologies are used on this website for proper function of the website, for tracking performance analytics and for marketing purposes. We and some of our third-party providers may use cookie data for various purposes. Please review the cookie settings below and choose your preference.
Strictly Necessary: Used for the proper function of the website
Performance/Analytics: Used for monitoring website traffic and interactions
In this episode Crux CTO Mark Etherington discusses the different costs involved in managing external data, how to think about the total return on investment for your data, and how the Crux platform is architected to reduce the toil involved in managing third party data. Tired of deploying bad data?
Summary Every part of the business relies on data, yet only a small team has the context and expertise to build and maintain workflows and datapipelines to transform, clean, and integrate it. RudderStack’s smart customer datapipeline is warehouse-first. What are the main goals that you have for Shipyard?
Datapipelines are the backbone of your business’s data architecture. Implementing a robust and scalable pipeline ensures you can effectively manage, analyze, and organize your growing data. We’ll answer the question, “What are datapipelines?” Table of Contents What are DataPipelines?
In this episode he shares his thoughts on the strategic and tactical elements of moving your work as a data professional from being task-oriented to being product-oriented and the long term improvements in your productivity that it provides. Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code.
Sherloq Datamanagement is critical when building internal gen AI applications, but it remains a challenge for most companies: Creating a verified source of truth and keeping it up to date with the latest documentation is a highly manual, high-effort task. The judges will deliberate live before naming the 2025 Grand Prize winner.
If data is delayed, outdated, or missing key details, leaders may act on the wrong assumptions. Regulatory Compliance Demands Data Governance: Data privacy laws such as GDPR and CCPA require organizations to track, secure, and audit sensitive information. Start Your Free Trial | Schedule a Demo
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you're ready to build your next pipeline, or want to test out the projects you hear about on the show, you'll need somewhere to deploy it, so check out our friends at Linode.
Tools like Python’s requests library or ETL/ELT tools can facilitate data enrichment by automating the retrieval and merging of external data. Read More: Discover how to build a datapipeline in 6 steps Data Integration Data integration involves combining data from different sources into a single, unified view.
Current open-source frameworks like YAML-based Soda Core, Python-based Great Expectations, and dbt SQL are frameworks to help speed up the creation of data quality tests. They are all in the realm of software, domain-specific language to help you write data quality tests. Download Now Request Demo
Product datamanagement fixes that. Product datamanagement (PDM) is the practice of organizing, storing, and managing all the data related to a product in one central system. Product datamanagement systems bring all of that information together into one structured, searchable place.
In this episode she shares the story behind the project, the details of how it is implemented, and how you can use it for your own data projects. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enable you to automatically send data to hundreds of downstream tools.
A well-executed datapipeline can make or break your company’s ability to leverage real-time insights and stay competitive. Thriving in today’s world requires building modern datapipelines that make moving data and extracting valuable insights quick and simple. What is a DataPipeline?
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
In this episode CEO and founder Salma Bakouk shares her views on the causes and impacts of "data entropy" and how you can tame it before it leads to failures. Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code. What is the motivating goal for the company and product?
If you are starting down the path of implementing a data governance strategy then this episode will provide a great overview of what is involved. If you hand a book to a new data engineer, what wisdom would you add to it? What is data governance? If you hand a book to a new data engineer, what wisdom would you add to it?
In this episode Isaac Brodsky explains how the Unfolded platform is architected, their experience joining the team at Foursquare, and how you can start using it for analyzing your spatial data today. Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you're ready to build your next pipeline, or want to test out the projects you hear about on the show, you'll need somewhere to deploy it, so check out our friends at Linode.
In this episode Arjun Narayan explains how the technical barriers to adopting real-time data in your analytics and applications have become surmountable by organizations of all sizes. Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code.
Summary One of the most critical aspects of software projects is managing its data. Managing the operational concerns for your database can be complex and expensive, especially if you need to scale to large volumes of data, high traffic, or geographically distributed usage.
In this episode he shares his experiences working with organizations to adopt analytics engineering patterns and the ways that Optimus and dbt were combined to let data analysts deliver insights without the roadblocks of complex pipelinemanagement. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.
He also discusses the improvements that have been incorporated into FeatureBase to simplify integration with the rest of your data stack, and the SQL interface that was added to make working with the product easier. Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
The Modern Story: Navigating Complexity and Rethinking Data in The Business Landscape Enterprises face a data landscape marked by the proliferation of IoT-generated data, an influx of unstructured data, and a pervasive need for comprehensive data analytics.
Going into the DataPipeline Automation Summit 2023, we were thrilled to connect with our customers and partners and share the innovations we’ve been working on at Ascend. The summit explored the future of datapipeline automation and the endless possibilities it presents.
He describes how the platform is architected, the challenges related to selling cloud technologies into enterprise organizations, and how you can adopt Matillion for your own workflows to reduce the maintenance burden of data integration workflows. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.
AI-assisted data modeling on shared data workloads Data sharing is critical to inform decision making across the organization. Snowflake eliminates the data sharing complexities of traditional datapipelines —making data secure, governed, and easily ready to query.
Summary Data lineage is the common thread that ties together all of your datapipelines, workflows, and systems. In order to get a holistic understanding of your data quality, where errors are occurring, or how a report was constructed you need to track the lineage of the data from beginning to end.
Many entries also used Snowpark , taking advantage of the ability to work in the code they prefer to develop datapipelines, ML models and apps, then execute in Snowflake. Our sincere thanks go out to everyone who participated in this year’s competition.
If you need to deal with massive data, at high velocities, in milliseconds, then Aerospike is definitely worth learning about. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code.
In this episode Prineha Narang, co-founder and CTO of Aliro, explains how these systems work, the capabilities that they can offer, and how you can start preparing for a post-quantum future for your data systems. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days or even weeks.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. Missing data?
They also discuss how they have established a guild system for training and supporting data professionals in the organization. Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.
CRN’s The 10 Hottest Data Science & Machine Learning Startups of 2020 (So Far). In June of 2020, CRN featured DataKitchen’s DataOps Platform for its ability to manage the datapipeline end-to-end combining concepts from Agile development, DevOps, and statistical process control: DataKitchen. DataKitchen.
She also discusses her views on the role of the data lakehouse as a building block for these architectures and the ongoing influence that it will have as the technology matures. Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code. Tired of deploying bad data?
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
The Modern Story: Navigating Complexity and Rethinking Data in The Business Landscape Enterprises face a data landscape marked by the proliferation of IoT-generated data, an influx of unstructured data, and a pervasive need for comprehensive data analytics.
He explains how they redesigned the core algorithms and storage management features to deliver ten times faster throughput, how the lower latencies work to reduce the burden on platform engineers, and how they are working toward an open source offering so that you can try it yourself with no friction. What is your target market and customer?
This allows everyone in the business to participate in data analysis in a sustainable manner. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription Modern data teams are dealing with a lot of complexity in their datapipelines and analytical code.
Summary As more organizations are gaining experience with datamanagement and incorporating analytics into their decision making, their next move is to adopt machine learning. Monte Carlo’s end-to-end Data Observability Platform monitors and alerts for data issues across your data warehouses, data lakes, ETL, and business intelligence.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
In this episode Sean Falconer explains the idea of a data privacy vault and how this new architectural element can drastically reduce the potential for making a mistake with how you manage regulated or personally identifiable information. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.
Announcements Hello and welcome to the Data Engineering Podcast, the show about modern datamanagement When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode.
We organize all of the trending information in your field so you don't have to. Join 37,000+ users and stay up to date on the latest articles your peers are reading.
You know about us, now we want to get to know you!
Let's personalize your content
Let's get even more personalized
We recognize your account from another site in our network, please click 'Send Email' below to continue with verifying your account and setting a password.
Let's personalize your content