Sat. Mar 28, 2020 - Fri. Apr 03, 2020

How to process simple data stream and consume with Lambda

Team Data Science

I built a serverless architecture for my simulated credit card complaints stream using AWS S3, AWS Lambda, and AWS Kinesis. The above picture gives a high-level view of the data flow. I treat uploading the CSV file as the data producer: once you upload a file, it generates an object-created event and the Lambda function is invoked asynchronously. The file's content is written to the Kinesis stream as a record (record = data + partition key), which triggers another Lambda function and persist th

Process 130

Scheduling a SQL script, using Apache Airflow, with an example

Start Data Engineering

One of the most common use cases for Apache Airflow is to run scheduled SQL scripts. Developers who start with Airflow often ask the following questions: “How to use Airflow to orchestrate SQL?”

SQL 130

The Life Of A Non-Profit Data Professional

Data Engineering Podcast

Summary: Building and maintaining a system that integrates and analyzes all of the data for your organization is a complex endeavor. Operating on a shoestring budget makes it even more challenging. In this episode, Tyler Colby shares his experiences working as a data professional in the non-profit sector. From managing Salesforce data models to wrangling a multitude of data sources and compliance challenges, he describes the biggest challenges that he is facing.

AWS 100

Bringing 4K and HDR to Anime at Netflix with Sol Levante

Netflix Tech

By Haruka Miyagawa & Kylee Peña

97

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Stable, Secure Apache Kafka as a Service – A Cloud Provider’s Tale

Confluent

Running fully managed Apache Kafka® as a service brings many responsibilities that leading cloud providers hide well. There is a reason why cloud services are so popular right now: companies realize […].

Kafka 97

People, We Need to Talk About Mass Electronic Surveillance

Teradata

With the COVID-19 epidemic in full swing, the countries that are faring the best are employing large-scale testing and electronic surveillance. But what does this mean for our civil liberties?

More Trending

Case Study: Fleet Management System – An End-to-End Streaming Data Pipeline

Rockset

PROBLEM STATEMENT: Fleet operators often suffer business and monetary losses due to a lack of information on the health of their fleet and the inventory it carries. This problem arises from a lack of real-time data on vehicle or inventory health, which prevents preemptive or real-time action. EXAMPLES: A vehicle’s coolant is leaking and engine temperature is going up.

Real-Time Data Replication with ksqlDB

Confluent

Data can originate in a number of different sources—transactional databases, mobile applications, external integrations, one-time scripts, etc.—but eventually it has to be synchronized to a central data warehouse for analysis […].

How Bayes' Theorem Helps Prediction Analytics in Teradata Vantage

Teradata

Learn about the underlying concept of Bayes' Theorem, which is the foundation for the Naive Bayes algorithm in Vantage, a powerful tool for predicting the outcome of business or healthcare-related events.

Self-Types in Scala: A Quick Overview

Rock the JVM

Self-types are a powerful way to enforce type constraints in Scala: learn to use them effectively in just a few minutes

Scala 52

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Sending HTTP Requests with Scala and Akka in 5 Minutes

Rock the JVM

Learn to use Akka HTTP with Scala and send HTTP requests in just a few minutes with the Akka HTTP server DSL

Scala 52

Why Is Contravariance So Hard in Scala?

Rock the JVM

Unravel the complexities of Scala's powerful type system with our deep dive into contravariance: we simplify and demystify its challenging aspects

Scala 52

8 Pattern Matching Tricks in Scala

Rock the JVM

Pattern matching is one of the most powerful Scala features: learn to use it to the max and write your best Scala code

Scala 52

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.