Sat. Mar 27, 2021 - Fri. Apr 02, 2021


Apache Kafka Made Simple: A First Glimpse of a Kafka Without ZooKeeper

Confluent

At the heart of Apache Kafka® sits the log—a simple data structure that uses sequential operations that work symbiotically with the underlying hardware. Efficient disk buffering and CPU cache usage, […].

Kafka 145

Cloudera Data Platform extends Hybrid Cloud vision support by supporting Google Cloud

Cloudera

CDP Public Cloud is now available on Google Cloud. The addition of support for Google Cloud enables Cloudera to deliver on its promise to offer its enterprise data platform at a global scale. CDP Public Cloud is already available on Amazon Web Services and Microsoft Azure. With the addition of Google Cloud, we deliver on our vision of providing a hybrid and multi-cloud architecture to support our customers’ analytics needs regardless of deployment platform.


Data Quality Management For The Whole Team With Soda Data

Data Engineering Podcast

Data quality is at the top of everyone’s mind recently, but getting it right is as challenging as ever. One of the contributing factors is the number of people who are involved in the process and the potential impact on the business if something goes wrong. In this episode Maarten Masschelein and Tom Baeyens share the work they are doing at Soda to bring everyone on board to make your data clean and reliable.


How to trigger a spark job from AWS Lambda

Start Data Engineering

Event-driven systems represent a software design pattern in which logic is executed in response to an event. This event can be a file creation on S3, a new database row, an API call, etc.

AWS 100

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.


Monitoring Your Event Streams: Integrating Confluent with Prometheus and Grafana

Confluent

Self-managing a highly scalable distributed system with Apache Kafka® at its core is not an easy feat. That’s why operators prefer tooling such as Confluent Control Center for administering and […].

Kafka 131

Hybrid Cloud and Strategic Data Use Accelerate State, Army Missions

Cloudera

Some of the most forward-operational elements of the United States federal government are making strides in leveraging data through hybrid cloud environments—and they’re constantly evaluating progress and recalibrating their approaches along the way. At agencies including the Army and the State Department, work is well underway to find ways of employing emerging technologies that build on cloud services and data optimization to realize new levels of effectiveness.

Cloud 77

More Trending


5 Reasons Data Discovery Platforms Are Best For Data Lakes

Monte Carlo

Over the past few years, data lakes have emerged as a must-have for the modern data stack. But while the technologies powering our access and analysis of data have matured, the mechanics behind understanding this data in a distributed environment have lagged behind. Here’s where data catalogs fall short and how data discovery platforms and tools can help ensure your data lake doesn’t turn into a data swamp.


How to Develop & Deliver Better Analytics

DataKitchen

Deep in Data host Robert Lacok interviews DataKitchen CEO Chris Bergh about what companies get wrong when building a data culture, DataOps success stories, and building a data product. The post How to Develop & Deliver Better Analytics first appeared on DataKitchen.


Reverse ETL with dbt and Grouparoo

Grouparoo

Teams are centralizing their data in their data warehouse by loading data in and transforming it as necessary. Increasingly, we are seeing teams turn to dbt to do this transforming. The idea is to write *.sql files that, when run in the right order, create useful rollup tables or materialized views of the data. We've been asked by teams using dbt how Grouparoo can then sync their data to their cloud-based apps.


CFO Analytics – Driving Value Through Analytics Automation

Teradata

Untimely & inaccurate financial analysis can lead to incorrect & costly business decisions. Users want accurate information now and automation is a key driver.


Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!


Software Security at Rocketship Pace

Afterpay Tech

By Alex Rosenzweig. Helping our engineers build amazing products that are worthy of our customers’ trust is the job of our product security team. One of the tools in the product security team’s arsenal is our code scanning platform, which we lovingly call “Intersect”. Effective code scanning is a core building block of a modern security program.

Coding 52

DataOps Is Not Just DevOps for Data

DataKitchen

Building the Backend host Travis Lawrence interviews DataKitchen CEO Chris Bergh on three value bombs: (1) DataOps is not just DevOps for data; (2) Architect for change; & (3) CDOs should be on the offense. The post DataOps Is Not Just DevOps for Data first appeared on DataKitchen.

Data 52

Akka Typed: Adapting Messages Explained

Rock the JVM

Learn how to organize code, messages, domains, and logic in an Akka application with Scala

Scala 52

We Stand as One with Asian American and Pacific Islander Communities

Teradata

At Teradata, we're standing together with AAPI communities & committing to take action. It is our responsibility to create a change for the future - a world where everyone experiences safety & equality.

IT 52

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.


How to Make a Data Science Career Transition in 2023?

ProjectPro

Over the last decade, there has been an explosion of interest in data science careers and in the number of people looking to make a career in data science. The COVID-19 pandemic caused millions to lose their jobs and forced people to rethink their careers. One question that our Project Advisors often get from nearly everyone interested in pursuing a data science career – whether they are just getting started or looking to transition into the most in-demand data science job market


A Career & Passion for DataOps

DataKitchen

Customer Insights Leader podcast host Paul Laughlin interviews DataKitchen CEO Chris Bergh about his career & passion for DataOps, how DataOps teams can measure success, & skills analysts should develop to succeed with DataOps. The post A Career & Passion for DataOps first appeared on DataKitchen.


Why Apache ECharts is the Future of Apache Superset™

Preset

Apache ECharts is a top open-source data visualization framework. In this post, we'll explore why the Apache Superset™ community is betting on ECharts.

Data 40

Case Study: Ritual’s Move to Real-Time Analytics to Personalize the Multivitamin Experience

Rockset

Ritual is a health-meets-technology company reimagining the products we use every day, starting with the multivitamin. With an in-house team of scientists and researchers, Ritual invests in creating high-quality, science-backed multivitamins formulated to help fill common nutrient gaps in diets at different life stages and making those nutrients readily traceable through the first visible supply chain of its kind. Taking an evidence-based approach, Ritual has designed multivitamins with key high

Food 40

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?


PyTorch vs TensorFlow 2023-A Head-to-Head Comparison

ProjectPro

‘Man and machine together can be better than the human.’ All thanks to deep learning frameworks like PyTorch, TensorFlow, Keras, Caffe, and DeepLearning4j for making machines learn like humans with special brain-like architectures known as neural networks. The war of deep learning frameworks has two prominent competitors, PyTorch and TensorFlow, because the other frameworks have not yet been adopted widely.


How the Open Edge Is Driving Digital Transformation

DataKitchen

The post How the Open Edge Is Driving Digital Transformation first appeared on DataKitchen.


Accelerated integration of Eventador with Cloudera – SQL Stream Builder

Cloudera

In October 2020, Cloudera made a strategic acquisition of a company called Eventador. This was primarily to augment our streaming capabilities within Cloudera DataFlow. Eventador was adept at simplifying the process of building streaming applications. Their flagship product, SQL Stream Builder, made access to real-time data streams easily possible with just SQL (Structured Query Language).

SQL 118

Data.What? Data Democratization and the Illusion of Self-Service

Teradata

The concepts and processes surrounding self-service analytics sound easy. So why does the illusion of self-service rarely translate to reality? Find out more.

Data 59

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.


Building a Better Data Culture: An Interview with ThoughtSpot’s Cindi Howson

Monte Carlo

We sat down with Cindi Howson, Chief Data Strategy Officer at ThoughtSpot, the leading search and AI-driven analytics platform, for a wide-ranging conversation about her daily work, common challenges organizations face on the road to data democratization, and diversity in data science. Over the past few decades, the world of data analytics has undergone transformation from a siloed entity to a cross-functional powerhouse.


Improving Collaboration with a DataOps Platform

DataKitchen

The post Improving Collaboration with a DataOps Platform first appeared on DataKitchen.


An Introduction to DataOps

DataKitchen

DATA MASTERS hosts Mark Marinelli & Nate Nelson interview DataKitchen CEO Chris Bergh. The post An Introduction to DataOps first appeared on DataKitchen.

Data 40

The Intersection of Data Democratization and Security

DataKitchen

The post The Intersection of Data Democratization and Security first appeared on DataKitchen.

Data 52

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.