article thumbnail

Reduce Friction In Your Business Analytics Through Entity Centric Data Modeling

Data Engineering Podcast

The major strategies in use today were created decades ago when the software and hardware for warehouse databases were far more constrained. In this episode Maxime Beauchemin of Airflow and Superset fame shares his vision for the entity-centric data model and how you can incorporate it into your own warehouse design.

article thumbnail

Data Engineering Weekly #182

Data Engineering Weekly

Adopting LLM in SQL-centric workflow is particularly interesting since companies increasingly try text-2-SQL to boost data usage. link] Murat Demirbas: Understanding the Performance Implications of Storage-Disaggregated Databases Serverless of anything (Postgres, Kafka, Redis) is the hot trend in infrastructure development.

article thumbnail

Every Company is Becoming a Software Company

Confluent

Of course, this is not to imply that companies will become only software (there are still plenty of people in even the most software-centric companies), just that the full scope of the business is captured in an integrated software defined process. Here, the bank loan business division has essentially become software.

article thumbnail

The Rise of the Data Engineer

Maxime Beauchemin

Storage and compute is cheaper than ever, and with the advent of distributed databases that scale out linearly, the scarcer resource is engineering time. The use of natural, human readable keys and dimension attributes in fact tables is becoming more common, reducing the need for costly joins that can be heavy on distributed databases.

article thumbnail

Why are database columns 191 characters?

Grouparoo

In this post, we’ll look at the historical reasons for the 191 character limit as a default in most relational databases. The first question you might ask is why limit the length of the strings you can store in a database at all? Why varchar and not text ? s fault 255 makes a lot more sense than 191. How did we get to 191?

article thumbnail

Serverless Data Pipelines On DataCoral

Data Engineering Podcast

Managing and auditing access to your servers and databases is a problem that grows in difficulty alongside the growth of your teams. You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management.

article thumbnail

Bringing Automation To Data Labeling For Machine Learning With Watchful

Data Engineering Podcast

With their new managed database service you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high throughput SSDs. Just connect it to your database/data warehouse/data lakehouse/whatever you’re using and let them do the rest.