Sat. May 06, 2023 - Fri. May 12, 2023

Datadog’s $65M/year customer mystery solved

The Pragmatic Engineer

The internet has spent the past few days speculating about which crypto company spent $65M on Datadog in 2022. I confirmed it was Coinbase, and here are the details of what happened. Originally published on 11 May 2023. 👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of six topics in today’s subscriber-only The Scoop issue.

AWS 318

OLTP Vs OLAP – What Is The Difference

Seattle Data Guy

If you’re relying on your OLTP system to provide analytics, you might be in for a surprise. While it can work initially, these systems aren’t designed to handle complex queries. Adding databases like MongoDB and CassandraDB only makes matters worse, since they’re not SQL-friendly – the language most analysts and data practitioners are used to.…

MongoDB 208

Polars – Laziness and SQL Context.

Confessions of a Data Guy

Polars is one of those tools that you just want … no … NEED a reason to use it. It’s gotten so bad, I’ve started to use it in my Rust code on the side, Polars that is. I mean you have a problem if you could use Polars Python, and you find yourself using […]

SQL 182

3 Ways to Access GPT-4 for Free

KDnuggets

Test out the top-of-the-line AI model for no charge.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Compensation at Publicly Traded Tech Companies

The Pragmatic Engineer

Insights from 50 publicly traded tech companies, and a list of those paying the most and the least in median total compensation. 👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover two out of seven topics from today’s subscriber-only deep-dive on Compensation at publicly traded tech companies.

Data Teams Survey 2023 Follow-Up

Jesse Anderson

The results and analysis from my 2023 Data Teams Survey left a few open questions. Let’s revisit these questions with some answers. Methodologies and Size of Company: We see a few commonalities across different company sizes, as shown in Figure 1 (Methodologies Broken Down By Size of Company Using Them). One striking commonality is that so many companies are using data mesh.

More Trending

Practical Statistics for Data Scientists

KDnuggets

Check out these essential statistical concepts for data science.

PagerDuty alternatives

The Pragmatic Engineer

This is a response to a tweet asking: "Why is there no competition to PagerDuty/Opsgenie? People in my team say it’s ‘just connecting to the Twilio API’ but if it were that easy, there’d probably be a ton of competition." PagerDuty is the market-leading incident alerting tool. Opsgenie is Atlassian's incident management tool, which is widespread thanks to Atlassian's distribution.

Systems 231

Upscaling LinkedIn's Profile Datastore While Reducing Costs

LinkedIn Engineering

Co-Authors: Estella Pham and Guanlin Lu. At peak, LinkedIn serves over 1.4 million member profiles per second. The number of requests to our storage infrastructure doubles every year. In the past, we addressed latency, throughput and cost issues by migrating off Oracle onto Espresso, an open-source document platform, and adding more nodes. We are now at the point where some of the core components are straining under the increasing load, and we can no longer address scaling concerns with the node

Database 133

How Lakehouse powers NLP for Customer Service Analytics in Insurance

databricks

Download the Databricks Insurance NLP Solution Accelerator. Introduction: The current economic and social climate has redefined customer expectations and preferences. Society has been.

Insurance 130

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Data Masking: The Core of Ensuring GDPR and other Regulatory Compliance Strategies

KDnuggets

This article has provided an overview of data masking and its importance in ensuring compliance with GDPR and other global regulations.

Data 143

Data News — Week 23.19

Christophe Blefari

Hey you, new Friday means Data News. This week is pretty stacked in terms of content, especially video / audio content. I hope you will enjoy it as much as I do. Let's start with my newly created podcast Minds of Data. In Minds of Data I'll meet people from the data ecosystem in order to learn more about them.

Data 130

Kinesis sequence number is not an Apache Kafka offset

Waitingforcode

I used to say "Kinesis Data Streams is like Apache Kafka, an append-only streaming broker with partitions and offsets". Although that's often true, unfortunately it's not that simple.

Kafka 130

Use Consistent And Up To Date Customer Profiles To Power Your Business With Segment Unify

Data Engineering Podcast

Summary: Every business has customers, and a critical element of success is understanding who they are and how they are using the company's products or services. The challenge is that most companies have a multitude of systems that contain fragments of the customer's interactions, and stitching that together is complex and time-consuming. Segment created the Unify product to reduce the burden of building a comprehensive view of customers and synchronizing it to all of the systems that need it.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Chatbot Arena: The LLM Benchmark Platform

KDnuggets

Chatbot Arena is a benchmark platform for large language models, where the community can contribute new models and evaluate them.

Process 131

Unifying Your Data Ecosystem with Delta Lake Integration

databricks

As organizations are maturing their data infrastructure and accumulating more data than ever before in their data lakes, Open and Reliable table formats.

Data Lake 122

New Approaches to Visualizing Snowflake Query Statistics with Snowflake Technology Partners

Snowflake

In December, Snowflake announced the public preview of the new get_query_operator_stats function, which opens up programmatic access to Snowflake query profiles and gives customers a whole new level of insight into Snowflake query performance and query execution statistics.

Tackling the Hidden and Unhidden Costs of Kafka

Confluent

Low utilization and operational complexity dramatically increase Kafka costs, so we reinvented Kafka as a cloud-native and complete service to reduce costs for thousands of businesses at any scale.

Kafka 107

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Mojo Lang: The New Programming Language

KDnuggets

Introducing Mojo Lang, the new programming language designed as a superset of Python.

Using Structured Streaming with Delta Sharing in Unity Catalog

databricks

We are excited to announce that support for using Structured Streaming with Delta Sharing is now generally available (GA) in Azure, AWS, and.

AWS 105

Metal as a Service (MaaS): DIY server-management at scale

LinkedIn Engineering

Guaranteeing that our servers are continually upgraded to secure and vetted operating systems is one major step that we take to ensure our members and customers can access LinkedIn to look for new roles, access new learning programs, or exchange knowledge with other professionals. LinkedIn has quite a large fleet of on-premises servers that depend on internal tooling to ensure they stay on the latest operating systems.

What Makes Confluent the World’s Most Trusted Cloud Data Streaming Platform

Confluent

Confluent manages 30,000+ Kafka clusters, produces over 3 trillion messages, and does durability checks on over 80 trillion Kafka messages per day while offering 99.99% uptime. Check out our cool stats!

Kafka 105

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

8 Free AI and LLMs Playgrounds

KDnuggets

If you’re interested in trying out AI for fun or learning more about it, then take a look at our list and explore the cutting-edge LLMs available in the wild.

Episode 1 — Joe Reis

Christophe Blefari

Data 100

Precisely Women in Technology: Meet Samantha Martino

Precisely

Technology is a vast industry that has something for everybody. Because of this, it attracts people from all backgrounds and areas of expertise. At Precisely, having diverse representation is the key to success, and as a result, it’s been highly important for the organization to support the unique perspective that employees bring to the table. The Precisely Women in Technology (PWIT) program was designed to connect women from across the organization to one another to offer support, an internal n

Cluster Policy Onboarding Primer

databricks

Introduction: This blog is part of our Admin Essentials series, where we'll focus on topics important to those managing and maintaining Databricks environments.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Data Scientist’s Guide to Cognitive Biases: A Free eBook

KDnuggets

Are you interested in exploring the topic of cognitive biases? Want to see how they may be affecting your data science practice? Check out this free ebook for this and more.

Data News — Week 23.18

Christophe Blefari

Hey you, this is a Saturday edition of the Data News. I hope this email finds you well. This week you'll have less editorial content because I'm late. But you'll still find awesome articles that have been written recently. As a reminder, on Tuesday next week I'm organising the Apache Airflow Paris meetup, which you should consider joining if you are in Paris.

PostgreSQL Import CSV: 3 Easy Methods

Hevo

As a business grows, so does the demand to handle and process its rapidly growing data efficiently. PostgreSQL is a popular open-source relational database used by organizations across the world. It helps developers build applications, and helps administrators protect data integrity and build fault-tolerant environments.

Unify Data and Business Teams with Sigma on Databricks

databricks

Data analytics has revolutionized the way businesses operate and compete, enabling them to make informed decisions and innovate their products and services. As.

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.