Remove Blog Remove Bytes Remove Coding
article thumbnail

Fault Tolerance in Distributed Systems: Tracing with Apache Kafka and Jaeger

Confluent

Instead, in this post I will point you to an earlier blog post where I already answered that question and then I will focus on what should be your next question: now that I’m relying on Jaeger to trace how data is flowing through my distributed system, what if Jaeger goes down? Distributed tracing with Apache Kafka and Jaeger.

Kafka 54
article thumbnail

Improving Efficiency Of Goku Time Series Database at Pinterest (Part?—?1)

Pinterest Engineering

In the first blog, we will share a short summary on the GokuS and GokuL architecture, data format for Goku Long Term, and how we improved the bootstrap time for our storage and serving components. More information about the architecture can be found in the GokuL blog and the cost reduction blog.

Database 111
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Pinterest is now on HTTP/3

Pinterest Engineering

These advancements fit well with Pinterest use cases  —  enabling faster connection establishment (time to first byte of first request), improved congestion control (large media as we have), multiplexing without TCP head-of-line blocking (multiple downloads at the same time), and continued in-flight requests when pinners’ device network/ip changes.

Bytes 141
article thumbnail

Open-Sourcing AvroTensorDataset: A Performant TensorFlow Dataset For Processing Avro Data

LinkedIn Engineering

In this blog post, we will discuss the AvroTensorDataset API, techniques we used to improve data processing speeds by up to 162x over existing solutions (thereby decreasing overall training time by up to 66%), and performance results from benchmarks and production. an array within a map, within a union, etc…). Default is 128 * 1024 (128KB).

Datasets 102
article thumbnail

Memory Optimizations for Analytic Queries in Cloudera Data Warehouse

Cloudera

Impala has always focused on efficiency and speed, being written in C++ and effectively using techniques such as runtime code generation and multithreading. For instance, in both the struct s above the largest member is a pointer of size 8 bytes. Total size of the Bucket is 16 bytes. Folding data into pointers.

article thumbnail

How to Stream JSON Data Using Server-Sent Events and FastAPI in Python over HTTP?

Workfall

Reading Time: 9 minutes In this blog, we will cover: What are Server-Sent Events? Fast to code : It enables considerable speed increases in development. Short : It reduces code duplication. Robust : It delivers production-ready code with interactive documentation. Then, we’ll dive into the server-side code.

Python 85
article thumbnail

Deploying Kafka Streams and KSQL with Gradle – Part 2: Managing KSQL Implementations

Confluent

We’ll demonstrate using Gradle to execute and test our KSQL streaming code, as well as building and deploying our KSQL applications in a continuous fashion. The first requirement to tackle: how to express dependencies between KSQL queries that exist in script files in a source code repository. Sample repository. gradlew composeUp.

Kafka 96