Sat, Mar 28, 2020 - Fri, Apr 03, 2020


How to process simple data stream and consume with Lambda

Team Data Science

I built a serverless architecture for my simulated credit card complaints stream using AWS S3, AWS Lambda, and AWS Kinesis; the picture above gives a high-level view of the data flow. Uploading the CSV file acts as the data producer: once a file is uploaded, an object-created event is generated and a Lambda function is invoked asynchronously. The file's contents are written to the Kinesis stream as a record (record = data + partition key), which triggers another Lambda function that persists the…
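
Not from the post itself, but a minimal Scala sketch (AWS SDK for Java v1) of the core step described above: wrapping file content into a Kinesis record, where a record is the data payload plus a partition key. The stream name and payload below are hypothetical.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder
import com.amazonaws.services.kinesis.model.PutRecordRequest

object ComplaintsProducer extends App {
  val kinesis = AmazonKinesisClientBuilder.defaultClient()

  // Stand-in for one line of the uploaded CSV file (hypothetical content).
  val csvLine = "12345,credit card,billing dispute"

  // A Kinesis record is the data plus a partition key;
  // the key decides which shard the record lands on.
  val request = new PutRecordRequest()
    .withStreamName("credit-card-complaints") // hypothetical stream name
    .withPartitionKey("12345")
    .withData(ByteBuffer.wrap(csvLine.getBytes(StandardCharsets.UTF_8)))

  kinesis.putRecord(request) // the consumer Lambda is invoked with batches of such records
}
```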

Process 130

Scheduling a SQL script, using Apache Airflow, with an example

Start Data Engineering

One of the most common use cases for Apache Airflow is running scheduled SQL scripts. Developers who start with Airflow often ask: “How to use Airflow to orchestrate SQL?”

SQL 130

Trending Sources


The Life Of A Non-Profit Data Professional

Data Engineering Podcast

Summary: Building and maintaining a system that integrates and analyzes all of the data for your organization is a complex endeavor. Operating on a shoestring budget makes it even more challenging. In this episode, Tyler Colby shares his experiences working as a data professional in the non-profit sector. From managing Salesforce data models to wrangling a multitude of data sources and compliance challenges, he describes the biggest challenges he is facing.

AWS 100

Bringing 4K and HDR to Anime at Netflix with Sol Levante

Netflix Tech

By Haruka Miyagawa & Kylee Peña.

97

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide, with best practices and examples, to debugging Airflow DAGs. You'll learn how to create a standardized debugging process to quickly diagnose errors in your DAGs, identify common issues with DAGs, tasks, and connections, and distinguish between Airflow-related…


Stable, Secure Apache Kafka as a Service – A Cloud Provider’s Tale

Confluent

Running fully managed Apache Kafka® as a service brings many responsibilities that leading cloud providers hide well. There is a reason why cloud services are so popular right now—companies realize […].

Kafka 94

People, We Need to Talk About Mass Electronic Surveillance

Teradata

With the COVID-19 epidemic in full swing, the countries that are faring the best are employing large-scale testing and electronic surveillance. But what does this mean for our civil liberties?

More Trending


Case Study: Fleet Management System – An End-to-End Streaming Data Pipeline

Rockset

PROBLEM STATEMENT: Fleet operators often suffer business and monetary losses due to a lack of information on the health of their fleet and the inventory it carries. The problem stems from the absence of real-time data on vehicle or inventory health that would let operators take preemptive or immediate action. EXAMPLES: A vehicle's coolant is leaking and its engine temperature is rising.


Real-Time Data Replication with ksqlDB

Confluent

Data can originate in a number of different sources—transactional databases, mobile applications, external integrations, one-time scripts, etc.—but eventually it has to be synchronized to a central data warehouse for analysis […].


How Bayes' Theorem Helps Prediction Analytics in Teradata Vantage

Teradata

Learn about the underlying concept of Bayes' Theorem, which is the foundation for the Naive Bayes algorithm in Vantage, a powerful tool for predicting the outcome of business- or healthcare-related events.
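
For reference, the theorem the post builds on, in its standard form (not specific to Vantage):

```latex
% Bayes' theorem: the posterior is the likelihood times the prior, normalized by the evidence.
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}

% Naive Bayes applies this per class c, assuming the features x_1, ..., x_n
% are conditionally independent given the class:
P(c \mid x_1, \dots, x_n) \propto P(c) \prod_{i=1}^{n} P(x_i \mid c)
```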


Self-Types in Scala: A Quick Overview

Rock the JVM

Self-types are a powerful way to enforce type constraints in Scala: learn to use them effectively in just a few minutes
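
As a flavor of what the article covers, here is a small, self-contained sketch (trait names are illustrative) of a self-type forcing a trait's implementors to also mix in another trait:

```scala
// A Logger mix-in with a concrete method the Worker will rely on.
trait Logger {
  def log(message: String): Unit = println(s"[log] $message")
}

// The self-type "self: Logger =>" says: whatever extends Worker must also be a Logger.
trait Worker { self: Logger =>
  def work(task: String): Unit = log(s"working on $task") // Logger's method is in scope
}

object SelfTypeDemo extends App {
  // Compiles only because Logger is mixed in alongside Worker.
  val worker = new Worker with Logger {}
  worker.work("data ingestion")
}
```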

Scala 52

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.


Sending HTTP Requests with Scala and Akka in 5 Minutes

Rock the JVM

Learn to use Akka HTTP with Scala and send HTTP requests in just a few minutes.
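
A minimal sketch of the idea, assuming Akka HTTP's client API on Akka 2.6+ and a throwaway target URL; the article walks through a fuller version:

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, HttpResponse}

import scala.concurrent.Future
import scala.util.{Failure, Success}

object SimpleHttpClient extends App {
  // Akka HTTP needs an ActorSystem; its dispatcher serves as the ExecutionContext.
  implicit val system: ActorSystem = ActorSystem("http-client")
  import system.dispatcher

  val responseFuture: Future[HttpResponse] =
    Http().singleRequest(HttpRequest(uri = "https://httpbin.org/get")) // placeholder URL

  responseFuture.onComplete {
    case Success(response) =>
      println(s"Got response with status ${response.status}")
      response.discardEntityBytes() // drain the body so the connection can be reused
      system.terminate()
    case Failure(exception) =>
      println(s"Request failed: $exception")
      system.terminate()
  }
}
```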

Scala 52

Why Is Contravariance So Hard in Scala?

Rock the JVM

Unravel the complexities of Scala's powerful type system with our deep dive into contravariance: we simplify and demystify its challenging aspects
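
As a taste of the topic, a tiny sketch (hypothetical Animal/Vet hierarchy) of what contravariance buys you:

```scala
trait Animal
class Dog extends Animal

// The -T annotation makes Vet contravariant in its type parameter.
trait Vet[-T] {
  def heal(animal: T): Boolean
}

object ContravarianceDemo extends App {
  val generalVet: Vet[Animal] = new Vet[Animal] {
    def heal(animal: Animal): Boolean = { println("healing an animal"); true }
  }

  // Contravariance lets a Vet[Animal] be used where a Vet[Dog] is expected:
  // a vet that can heal any animal can certainly heal a dog.
  val dogVet: Vet[Dog] = generalVet
  dogVet.heal(new Dog)
}
```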

Scala 52

8 Pattern Matching Tricks in Scala

Rock the JVM

Pattern matching is one of the most powerful Scala features: learn to use it to the max and write your best Scala code
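
A small, self-contained example (illustrative data only) of a few of the tricks the article covers, such as guards, alternatives, name binding, and list deconstruction:

```scala
object PatternMatchingDemo extends App {
  case class Person(name: String, age: Int)

  val alice = Person("Alice", 34)

  // Guards, alternatives inside a pattern, and name binding with @ in one match.
  val description = alice match {
    case Person(_, age) if age < 18     => "a minor"
    case p @ Person("Alice" | "Bob", _) => s"a known adult: ${p.name}"
    case Person(name, _)                => s"an adult named $name"
  }
  println(description)

  // Deconstructing a list: match a fixed prefix and bind the rest.
  List(1, 2, 3, 4) match {
    case 1 :: second :: rest => println(s"starts with 1, then $second, rest is $rest")
    case _                   => println("something else")
  }
}
```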

Scala 52

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speakers: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.