Top Data Engineering Digest Kafka Cloud Content for Week of Jun 13

Sat. Jun 13, 2020 - Fri. Jun 19, 2020

3 Key techniques, to optimize your Apache Spark code

Start Data Engineering

JUNE 19, 2020

Intro: A lot of tutorials show how to write Spark code with just the API and code samples, but they do not explain how to write “efficient Apache Spark” code.

Coding

Coding IT Data

Business Intelligence meets Data Engineering with Emerging Technologies

Simon Späti

JUNE 14, 2020

Today we have more requirements, with ever-growing tools and frameworks, complex cloud architectures, and a data stack that is changing rapidly. I hear claims such as “Business Intelligence (BI) takes too long to integrate new data” or “understanding how the numbers match up is very hard and needs lots of analysis”. The goal of this article is to make business intelligence easier, faster and more accessible with techniques from the sphere of data engineering.

Business Intelligence

Business Intelligence Data Engineer Data Engineering Technology

Webinars

Precision in Motion: Why Process Optimization Is the Future of Manufacturing

Airflow Best Practices for ETL/ELT Pipelines

MORE WEBINARS

Trending Sources

The Cost of Apache Kafka: An Engineer’s Guide to Pricing Out DIY Operations

Confluent

JUNE 19, 2020

When I have a small software project that I want to share with the world, I don’t write my own version control system with a web UI. I don’t even […].

Kafka

Kafka Engineering Project Systems

There Are No Perfect Words…

Teradata

JUNE 17, 2020

Juneteenth has been declared a U.S. holiday at Teradata, as we stand with the black community and reflect on what we can do to fight racism and injustice, and embrace diversity.

Precision in Motion: Why Process Optimization Is the Future of Manufacturing

Speaker: Jason Chester, Director, Product Management

In today’s manufacturing landscape, staying competitive means moving beyond reactive quality checks and toward real-time, data-driven process control. But what does true manufacturing process optimization look like—and why is it more urgent now than ever? Join Jason Chester in this new, thought-provoking session on how modern manufacturers are rethinking quality operations from the ground up.

Manufacturing

Accelerate Your Machine Learning With The StreamSQL Feature Store

Data Engineering Podcast

JUNE 15, 2020

Summary: Machine learning is a process driven by iteration and experimentation, which requires fast and easy access to relevant features of the data being processed. In order to reduce friction in the process of developing and delivering models, there has been a recent trend toward building a dedicated feature store. In this episode Simba Khadder discusses his work at StreamSQL building a feature store to make creation, discovery, and monitoring of features fast and easy to manage.

Machine Learning

Machine Learning Google Cloud Kafka Data Engineer

Understanding Azure Synapse Analytics

Advancing Analytics: Data Engineering

JUNE 16, 2020

You might have seen that I’ve been pretty busy recently, digging into the new Azure Synapse Analytics preview, announced back at Microsoft Build 2020. I’ve explored the Spark engine, SQL serverless/On-Demand and various other bits… but I’m still getting the same question of “Cool!… but what actually is it?”. One of the problems here is that Azure SQL Data Warehouse was rebranded as “Azure Synapse Analytics”… but it’s not the same as the full workspace.

SQL

SQL Data Warehouse Engineering Data Engineer

Project Metamorphosis Month 2: Cost-Effective Apache Kafka for Use Cases Big and Small

Confluent

JUNE 17, 2020

In April, we kicked off Project Metamorphosis. Project Metamorphosis is an effort to bring the simplicity of best of breed cloud systems to the world of event streaming. It is […].

Project

Project Kafka Cloud Systems

More Trending

AWS First-Party Service Integration with Teradata Vantage

Teradata

JUNE 16, 2020

Integration with AWS first-party services gives our enterprise customers as much cloud-native functionality as they want for their Vantage environments. Learn more.

AWS

AWS Cloud

Comparing Akka Streams, Kafka Streams and Spark Streaming

Rock the JVM

JUNE 16, 2020

Explore how Akka Streams, Kafka Streams, and Spark Streaming stack up and find out which one is best for your use case.

Kafka

JOINs and Aggregations Using Real-Time Indexing on MongoDB Atlas

Rockset

JUNE 16, 2020

MongoDB.live took place last week, and Rockset had the opportunity to participate alongside members of the MongoDB community and share about our work to make MongoDB data accessible via real-time external indexing. In our session, we discussed the need for modern data-driven applications to perform real-time aggregations and joins, and how Rockset uses MongoDB change streams and Converged Indexing to deliver fast queries on data from MongoDB.

MongoDB

MongoDB Data Lake PostgreSQL Kafka

Build Real-Time Observability Pipelines with Confluent Cloud and AppDynamics

Confluent

JUNE 18, 2020

Many organisations rely on commercial or open source monitoring tools to measure the performance and stability of business-critical applications. AppDynamics, Datadog, and Prometheus are widely used commercial and open source […].

Cloud

Cloud Building Kafka

Airflow Best Practices for ETL/ELT Pipelines

Speaker: Kenten Danas, Senior Manager, Developer Relations

ETL and ELT are some of the most common data engineering use cases, but can come with challenges like scaling, connectivity to other systems, and dynamically adapting to changing data sources. Airflow is specifically designed for moving and transforming data in ETL/ELT pipelines, and new features in Airflow 3.0 like assets, backfills, and event-driven scheduling make orchestrating ETL/ELT pipelines easier than ever!

Data Engineering

Microsoft Azure First-Party Service Integration with Teradata Vantage

Teradata

JUNE 18, 2020

Integration with Azure first-party services enables Vantage users to tap into new sources of innovation across all aspects of the analytic process from start to finish.

Process

Your Essential dbt Project Checklist

dbt Developer Hub

JUNE 19, 2020

If you’ve been using dbt for over a year, your project is out-of-date. This is natural. New functionalities have been released. Warehouses change. Best practices are updated. Over the last year, I and others on the Fishtown Analytics (now dbt Labs!) team have conducted seven audits for clients who have been using dbt for a minimum of 2 months. In every single audit, we found opportunities to: improve performance; improve maintainability; make it easier for new people to get up-to-speed on the proj

Project

Project BI SQL Utilities

Lloyds Banking Group

Teradata

JUNE 17, 2020

Lloyds Banking Group executes analytic projects that benefit the customer journey for multiple brands within Lloyds Banking Group.

Banking

Banking Project

Intelligent Analytics for Telcos Using Teradata Vantage

Teradata

JUNE 14, 2020

Learn how leveraging Machine Learning for advanced analytics enables Telcos to tackle problems from identifying network anomalies to customer churn. Read more.

Machine Learning

What’s New in Apache Airflow 3.0 –– And How Will It Reshape Your Data Workflows?

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Data Workflow
