Bytes, Data Collection and Kafka - Data Engineering Digest

Streaming Data from the Universe with Apache Kafka

Confluent

JUNE 13, 2019

You might think that data collection in astronomy consists of a lone astronomer pointing a telescope at a single object in a static sky. While that may be true in some cases (I collected the data for my Ph.D. thesis this way), the field of astronomy is rapidly changing into a data-intensive science with real-time needs.

Kafka

Kafka Bytes Data Pipeline Transportation

A Beginners Guide to Spark Streaming Architecture with Example

ProjectPro

JUNE 6, 2025

Apache Spark Streaming Use Cases Spark Streaming Architecture: Discretized Streams Spark Streaming Example in Java Spark Streaming vs. Structured Streaming Spark Streaming Structured Streaming What is Kafka Streaming? Kafka Stream vs. Spark Streaming What is Spark streaming? live logs, IoT device data, system telemetry data, etc.)

Architecture

Architecture Kafka Java Scala

50 PySpark Interview Questions and Answers For 2025

ProjectPro

JUNE 6, 2025

These DStreams allow developers to cache data in memory, which may be particularly handy if the data from a DStream is used several times. The cache() function or the persist() method with proper persistence settings can be used to cache data. A DataFrame is an immutable distributed columnar data collection.

Hadoop

Hadoop Metadata Java Datasets

How to Become a Big Data Engineer in 2025

ProjectPro

JUNE 6, 2025

Becoming a Big Data Engineer - The Next Steps Big Data Engineer - The Market Demand An organization’s data science capabilities require data warehousing and mining, modeling, data infrastructure, and metadata management. Most of these are performed by Data Engineers.

Big Data

Big Data Data Engineering Data Engineer Engineering

BPFAgent: eBPF for Monitoring at DoorDash

DoorDash Engineering

AUGUST 15, 2023

Here we will take a look at how we built BPFAgent, the process of building and maintaining its probes, and how various DoorDash teams have used the data collected. We also have an unmarshalling function to convert the raw bytes from the kernel into our structure. struct sock *sk = (struct sock *)PT_REGS_PARM1(ctx); if (!sk)
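As a rough illustration of what an unmarshalling function like the one mentioned above does, here is a minimal Python sketch that decodes a fixed-layout byte buffer into a typed record. The event layout, the `ConnEvent` class, and the `unmarshal` name are all hypothetical and are not taken from BPFAgent; the excerpt does not show the real structure.

```python
import struct
from dataclasses import dataclass

# Hypothetical event layout (an assumption, not DoorDash's actual struct):
# u32 pid, u16 source port, u16 destination port, u64 bytes sent,
# little-endian with no padding.
EVENT_FORMAT = "<IHHQ"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)


@dataclass
class ConnEvent:
    pid: int
    sport: int
    dport: int
    bytes_sent: int


def unmarshal(raw: bytes) -> ConnEvent:
    """Convert raw bytes (e.g. read from a kernel buffer) into a ConnEvent."""
    if len(raw) < EVENT_SIZE:
        raise ValueError(f"expected at least {EVENT_SIZE} bytes, got {len(raw)}")
    # unpack_from reads exactly EVENT_SIZE bytes from the start of the buffer.
    return ConnEvent(*struct.unpack_from(EVENT_FORMAT, raw))


# Build a sample buffer the way a producer would, then decode it.
sample = struct.pack(EVENT_FORMAT, 4242, 51000, 5432, 1024)
event = unmarshal(sample)
print(event)
```

The length check matters in practice: a short read from the kernel side should fail loudly rather than yield a partially filled record.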

Bytes

Bytes PostgreSQL Coding Database

A Beginners Guide to Spark Streaming Architecture with Example

ProjectPro

DECEMBER 28, 2021

Apache Spark Streaming Use Cases Spark Streaming Architecture: Discretized Streams Spark Streaming Example in Java Spark Streaming vs. Structured Streaming Spark Streaming Structured Streaming What is Kafka Streaming? Kafka Stream vs. Spark Streaming What is Spark streaming? live logs, IoT device data, system telemetry data, etc.)

Architecture

Architecture Kafka Java Scala

Forge Your Career Path with Best Data Engineering Certifications

ProjectPro

JUNE 6, 2025

The World Economic Forum predicts that by 2025, 463 exabytes of data will be produced daily across the world. An exabyte is 1000^6 bytes, so to put it into perspective, 463 exabytes is the same as 212,765,957 DVDs. Expertise in creating scalable and efficient data processing architectures, and in monitoring data processing systems.
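The unit conversion in that excerpt can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the decimal (SI) definition of an exabyte; the variable names are illustrative.

```python
# Decimal (SI) exabyte: 1000^6 bytes, i.e. 10^18 bytes.
EXABYTE = 1000 ** 6

# Daily volume quoted in the excerpt, expressed in bytes.
daily_exabytes = 463
daily_bytes = daily_exabytes * EXABYTE

# Average rate implied by that daily volume.
SECONDS_PER_DAY = 24 * 60 * 60
bytes_per_second = daily_bytes / SECONDS_PER_DAY

print(f"1 exabyte    = {EXABYTE:,} bytes")
print(f"463 exabytes = {daily_bytes:,} bytes per day")
print(f"average rate = {bytes_per_second:.3e} bytes per second")
```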

Certification

Certification Data Engineering Data Engineer Engineering

A Gentle Introduction to Analytical Stream Processing

Towards Data Science

APRIL 3, 2023

Your event data exists as a complete idea, or as partial ideas or thoughts. I have found that thinking of data as a story over time helps to give life to these bytes of data. Consider this simple truth. Just use the app to redeem”.

Process

Process Data Lake Bytes Systems

97 things every data engineer should know

Grouparoo

OCTOBER 6, 2021

13 Column Names as Contracts: Standardize column names to minimize confusion
14 Consensual, Privacy-Aware Data Collection: At some point does Grouparoo get properties noted as PII and what it means for a profile to opt out?
15 Cultivate Good Working Relationships with Data Consumers: Practice empathy
16 Data Engineering !

Data Engineering

Data Engineering Data Engineer Engineering Pipeline-centric

How to Become a Big Data Engineer in 2023

ProjectPro

SEPTEMBER 26, 2021

Becoming a Big Data Engineer - The Next Steps Big Data Engineer - The Market Demand An organization’s data science capabilities require data warehousing and mining, modeling, data infrastructure, and metadata management. Most of these are performed by Data Engineers.

Big Data

Big Data Data Engineering Data Engineer Engineering

Forge Your Career Path with Best Data Engineering Certifications

ProjectPro

FEBRUARY 21, 2023

This blog covers the most valuable data engineering certifications worth paying attention to in 2023 if you plan to land a successful job in the data engineering domain. Why Are Data Engineering Skills In Demand? The World Economic Forum predicts that by 2025, 463 exabytes of data will be produced daily across the world.

Certification

Certification Data Engineering Data Engineer Engineering

50 PySpark Interview Questions and Answers For 2023

ProjectPro

NOVEMBER 22, 2021

These DStreams allow developers to cache data in memory, which may be particularly handy if the data from a DStream is utilized several times. The cache() function or the persist() method with proper persistence settings can be used to cache data. You can learn a lot by utilizing PySpark for data intake processes.

Hadoop

Hadoop Metadata Java SQL
