Reduce Friction In Your Business Analytics Through Entity Centric Data Modeling

Data Engineering Podcast

The major strategies in use today were created decades ago when the software and hardware for warehouse databases were far more constrained. In this episode Maxime Beauchemin of Airflow and Superset fame shares his vision for the entity-centric data model and how you can incorporate it into your own warehouse design.

Data Engineering Weekly #182

Data Engineering Weekly

Adopting LLMs in SQL-centric workflows is particularly interesting, since companies increasingly try text-to-SQL to boost data usage. Murat Demirbas: Understanding the Performance Implications of Storage-Disaggregated Databases. Serverless anything (Postgres, Kafka, Redis) is the hot trend in infrastructure development.

Every Company is Becoming a Software Company

Confluent

Of course, this is not to imply that companies will become only software (there are still plenty of people in even the most software-centric companies), just that the full scope of the business is captured in an integrated, software-defined process. Here, the bank loan business division has essentially become software.

Serverless Data Pipelines On DataCoral

Data Engineering Podcast

Managing and auditing access to your servers and databases is a problem that grows in difficulty alongside the growth of your teams. You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management.

The Rise of the Data Engineer

Maxime Beauchemin

Storage and compute are cheaper than ever, and with the advent of distributed databases that scale out linearly, the scarcer resource is engineering time. The use of natural, human-readable keys and dimension attributes in fact tables is becoming more common, reducing the need for costly joins that can be heavy on distributed databases.

Bringing Automation To Data Labeling For Machine Learning With Watchful

Data Engineering Podcast

With their new managed database service you can launch a production-ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high-throughput SSDs. Just connect it to your database/data warehouse/data lakehouse/whatever you’re using and let them do the rest.

Data Engineering Weekly #174

Data Engineering Weekly

Sponsored: DoubleCloud - More than just ClickHouse. ClickHouse is the fastest, most resource-efficient OLAP database, which queries billions of rows in milliseconds and is trusted by thousands of companies for real-time analytics. The author highlights the structured approach to building data infrastructure, data management, and metrics.