Sat. Feb 16, 2019 - Fri. Feb 22, 2019

Speed Up Your Analytics With The Alluxio Distributed Storage System

Data Engineering Podcast

Summary: Distributed storage systems are the foundational layer of any big data stack. There are a variety of implementations which support different specialized use cases and come with associated tradeoffs. Alluxio is a distributed virtual filesystem which integrates with multiple persistent storage systems to provide a scalable, in-memory storage layer for scaling computational workloads independent of the size of your data.

Systems 100

Extending Vector with eBPF to inspect host and container performance

Netflix Tech

by Jason Koch, with Martin Spier, Brendan Gregg, Ed Hunter. Improving the tools available to our engineers to help them diagnose, triage, and work through software performance challenges in the cloud is a key goal for the cloud performance engineering team at Netflix. Today we are excited to announce latency heatmaps and improved container support for our on-host monitoring solution…

Trending Sources

Sysmon Security Event Processing in Real Time with KSQL and HELK

Confluent

During a recent talk titled "Hunters ATT&CKing with the Right Data", which I presented with my brother Jose Luis Rodriguez at ATT&CKcon, we talked about the importance of documenting and modeling security event logs before developing any data analytics while preparing for a threat hunting engagement. Defining relationships among Windows security event logs such as Sysmon, for example, helped us to appreciate the extra context that two or more events together can provide for a hunt.

Process 82

How to Run SQL on PDF Files

Rockset

PDFs are the de facto standard for distributing and sharing fixed-layout documents today. A quick survey of my laptop folders reveals account statements, receipts, technical papers, book chapters, and presentation slides—all PDFs. Lots of valuable information finds its way into all manner of PDF files. Which is a great reason for Rockset to support SQL queries on PDF files, in our mission to make data more usable to everyone.

SQL 52

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

A Journey On End To End Testing A Microservices Architecture

Zalando Engineering

End-to-end testing is a technique used to test the flow of an application through a business transaction. In a microservices architecture, different components work together to enable a business capability, so testing all of them can get tricky. In this article you can read about our team's journey: what our system looks like and what you get from e2e testing.

It's the Relationship - Not Just the Data - That is Critical to Success

Teradata

Rob Armstrong explains that while data is important, the real key is preserving the relationships across the data models that lead to insight and successful business outcomes.

IT 40

More Trending

Using Smart Schema to Accelerate Insights from Nested JSON

Rockset

Developers often need to work with datasets without a fixed schema, like heavily nested JSON data with several deeply nested arrays and objects, mixed data types, null values, and missing fields. In addition, the shape of the data is prone to change when continuously syncing new data. Understanding the shape of a dataset is crucial to constructing complex queries for building applications or performing data science investigations.

Kafka Summit 2019: 3 Big Things!

Confluent

How many Kafka Summits should there be in a year? Experts disagree. Some say there should be one giant event where everybody gathers at once. Some say there should be one once a month in different regions of the world. Others say you should live every day like it’s Kafka Summit. As you may know, we have adopted a happy medium: three Summits in 2019.

Kafka 64

The Utah Jazz Uses Pervasive Data Intelligence for Next Generation Sports Analytics

Teradata

Larry H. Miller is using data and analytics to successfully increase customer satisfaction from a multitude of data sources and customer touchpoints.

Data 40

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.