Data Engineering Weekly #182

Data Engineering Weekly

Adopting LLMs in SQL-centric workflows is particularly interesting, since companies increasingly try text-to-SQL to boost data usage. Pipeline breakpoint feature. The blog highlights the 2024 SIGMOD paper "Understanding the Performance Implications of the Design Principles in Storage-Disaggregated Databases."

Every Company is Becoming a Software Company

Confluent

Of course, this is not to imply that companies will become only software (there are still plenty of people in even the most software-centric companies), just that the full scope of the business is captured in an integrated, software-defined process. Here, the bank loan business division has essentially become software.

Serverless Data Pipelines On DataCoral

Data Engineering Podcast

Summary: How much time do you spend maintaining your data pipeline? Managing and auditing access to your servers and databases is a problem that grows in difficulty alongside the growth of your teams. How does the data-centric approach of DataCoral differ from the way that other platforms think about processing information?

Bringing Automation To Data Labeling For Machine Learning With Watchful

Data Engineering Podcast

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. Data stacks are becoming more and more complex.

The Rise of the Data Engineer

Maxime Beauchemin

Storage and compute are cheaper than ever, and with the advent of distributed databases that scale out linearly, the scarcer resource is engineering time. The use of natural, human-readable keys and dimension attributes in fact tables is becoming more common, reducing the need for costly joins that can be heavy on distributed databases.
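
To make the point about natural keys concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (dim_country, fact_sales_surrogate, fact_sales_natural) are hypothetical and chosen only for illustration; the excerpt does not prescribe a schema. It compares a fact table that stores a surrogate key, and so needs a join to report by country code, with one that stores the readable code directly.

```python
import sqlite3

# In-memory database; all table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country_code TEXT);
    CREATE TABLE fact_sales_surrogate (country_id INTEGER, amount REAL);
    CREATE TABLE fact_sales_natural (country_code TEXT, amount REAL);

    INSERT INTO dim_country VALUES (1, 'US'), (2, 'FR');
    INSERT INTO fact_sales_surrogate VALUES (1, 10.0), (1, 5.0), (2, 7.0);
    INSERT INTO fact_sales_natural VALUES ('US', 10.0), ('US', 5.0), ('FR', 7.0);
    """
)

# Surrogate-key design: the readable code lives in the dimension, so a join is required.
with_join = conn.execute(
    """
    SELECT d.country_code, SUM(f.amount)
    FROM fact_sales_surrogate AS f
    JOIN dim_country AS d ON d.country_id = f.country_id
    GROUP BY d.country_code
    ORDER BY d.country_code
    """
).fetchall()

# Natural-key design: the readable code is stored in the fact table, so no join is needed.
without_join = conn.execute(
    """
    SELECT country_code, SUM(amount)
    FROM fact_sales_natural
    GROUP BY country_code
    ORDER BY country_code
    """
).fetchall()
```

Both queries return the same totals per country; the second trades some storage for a simpler query, which is the trade-off the excerpt describes.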

Data Engineering Weekly #174

Data Engineering Weekly

Sponsored: DoubleCloud - More than just ClickHouse. ClickHouse is the fastest, most resource-efficient OLAP database, which queries billions of rows in milliseconds and is trusted by thousands of companies for real-time analytics. The author highlights the structured approach to building data infrastructure, data management, and metrics.

Building a Scalable Search Architecture

Confluent

As the databases professor at my university used to say, it depends. Using SQL to run your search might be enough for your use case, but as your project requirements grow and more advanced features are needed—for example, enabling synonyms, multilingual search, or even machine learning—your relational database might not be enough.
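
As a rough illustration of where plain SQL search starts to strain, the sketch below uses Python's built-in sqlite3 module. The products table, the sample rows, and the SYNONYMS mapping are all hypothetical; they are not taken from the article. A simple LIKE query finds literal matches, while synonym support has to be bolted on by hand in application code.

```python
import sqlite3

# Hypothetical, hand-maintained synonym map; a dedicated search engine
# would typically manage this kind of expansion for you.
SYNONYMS = {"laptop": ["notebook"], "notebook": ["laptop"]}


def build_db():
    """Create an in-memory table with a few sample product names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO products (name) VALUES (?)",
        [("Gaming laptop",), ("Ultralight notebook",), ("Desk lamp",)],
    )
    return conn


def sql_search(conn, term):
    """Plain substring search: only rows containing the literal term match."""
    rows = conn.execute(
        "SELECT name FROM products WHERE name LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [name for (name,) in rows]


def synonym_search(conn, term):
    """Expand the term with its synonyms and OR the LIKE conditions together."""
    terms = [term] + SYNONYMS.get(term, [])
    clause = " OR ".join("name LIKE ?" for _ in terms)
    rows = conn.execute(
        f"SELECT name FROM products WHERE {clause} ORDER BY id",
        [f"%{t}%" for t in terms],
    )
    return [name for (name,) in rows]
```

With this sample data, sql_search(conn, "laptop") finds only the laptop row, while synonym_search(conn, "laptop") also returns the notebook. Each further feature (multilingual analysis, ranking) would need similar hand-written logic, which is the growth in requirements the excerpt alludes to.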