Sat. Feb 06, 2021 - Fri. Feb 12, 2021

Node.js ❤️ Apache Kafka – Getting Started with KafkaJS

Confluent

One of the great things about using an Apache Kafka® based architecture is that it naturally decouples systems and allows you to use the best tool for the job. While […].

Kafka 145

How Shopify Is Building Their Production Data Warehouse Using DBT

Data Engineering Podcast

Summary: With all of the tools and services available for building a data platform, it can be difficult to separate the signal from the noise. One of the best ways to get a true understanding of how a technology works in practice is to hear from people who are running it in production. In this episode, Zeeshan Qureshi and Michelle Ark share their experiences using DBT to manage the data warehouse for Shopify.

How to Join a fact and a type 2 dimension (SCD2) table

Start Data Engineering

Contents: Introduction; What is an SCD2 table and why use it?; Application table; Dimension table; Setup; Joining fact and SCD2 tables; high_spenders; user_items; Educating end users; Conclusion; Further reading. Introduction: If you are using a data warehouse, you have probably heard of fact and dimension tables. Simply put, fact tables record business events and dimension tables record the attributes of business items (e.g., the user and item tables in an e-commerce app).

Next Stop – Building a Data Pipeline from Edge to Insight

Cloudera

This is part 2 in this blog series. You can read part 1 here: Digital Transformation is a Data Journey From Edge to Insight. This blog series follows the manufacturing, operations and sales data for a connected vehicle manufacturer as the data goes through stages and transformations typically experienced in a large manufacturing company on the leading edge of current technology.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Introducing Confluent Platform 6.1

Confluent

We are pleased to announce the release of Confluent Platform 6.1. With this release, we are further simplifying management tasks for Apache Kafka® operators and providing even higher availability for […].

Kafka 142

Is Your Data Holding You Back Instead of Driving You Forward?

Teradata

Everyone knows that data is vital for success in retail. But without a clear data strategy, retailers often eat up resources fighting small-scale battles, whilst gradually losing the war.

Retail 112

More Trending

Data – the Octane Accelerating Intelligent Connected Vehicles

Cloudera

The digital revolution is making a deep impact on the automotive industry, offering practically unlimited possibilities for more efficient, convenient, and safe driving and travel experiences in connected vehicles. This revolution is just beginning to accelerate – in fact, according to a recent Applied Market Research study, the global connected car market was valued at $63.03 billion in 2019, and is projected to reach $225.16 billion by 2027, registering a CAGR of 17.1% from 2020 to 2027.

How to Write a Connector for Kafka Connect – Deep Dive into Configuration Handling

Confluent

Kafka Connect is part of Apache Kafka®, providing streaming integration of external systems in and out of Kafka. There are a large number of existing connectors, and you can also […].

Kafka 83

From Product Cycle to Digital Thread

Teradata

In order to survive, the auto industry needs to leverage ‘digital threads’ that connect data from customers to dealers to products, and link R&D to the production line and the aftermarket.

Data 69

Edge Authentication and Token-Agnostic Identity Propagation

Netflix Tech

by AIM Team Members Karen Casella, Travis Nelson, Sunny Singh; with prior art and contributions by Justin Ryan, Satyajit Thadeshwar. As most developers can attest, dealing with security protocols and identity tokens, as well as user and device authentication, can be challenging. Imagine having multiple protocols, multiple tokens, 200M+ users, and thousands of device types, and the problem can explode in scope.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

Cloudera Operational Database application development concepts

Cloudera

Cloudera Operational Database is now available in three different form factors in Cloudera Data Platform (CDP). If you are new to Cloudera Operational Database, see this blog post. And, check out the documentation here. In this blog post, we’ll look at both Apache HBase and Apache Phoenix concepts relevant to developing applications for Cloudera Operational Database.

Database 104

Automatic Observer Promotion Brings Fast and Safe Multi-Datacenter Failover with Confluent Platform 6.1

Confluent

Persisting data in multiple regions has become crucial for modern businesses: They need their mission-critical data to be protected from accidents and disasters. They can achieve this goal by running […].

Data 59

Monte Carlo Raises $25M Series B to Help Companies Achieve More Reliable Data

Monte Carlo

In 2021, data is your company’s most critical asset. As data pipelines become increasingly complex and companies ingest more and more data, it’s paramount that this data is reliable. After talking to hundreds of data teams over the past few years, I was struck by the fact that organizations were investing millions of dollars and strategic energy in data, but decision makers and others on the frontlines couldn’t use it or didn’t trust it.

Growth Engineering at Netflix - Automated Imagery Generation

Netflix Tech

by Eric Eiswerth. Background: There’s a good chance you’ve visited the Netflix homepage. In the Growth Engineering team, we refer to this as the top of the signup funnel. For more background on the signup funnel and Growth Engineering’s role in it, please read our initial post on the topic: Growth Engineering at Netflix…

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Coffee with Cloudera: Vinita Srivalsan

Cloudera

Meet Vinita Srivalsan, the powerhouse leader of the Partner Marketing team. Since this is Coffee with Cloudera, what’s your morning pick-me-up drink? I am a Chai person through and through and make it the traditional Indian way with milk and sugar! What makes your role at Cloudera unique? Partner Marketing is uniquely positioned to be the voice of Cloudera within a partner organization, and to represent the partner within Cloudera.

How to Configure Your dbt Repository (One or Many)?

dbt Developer Hub

At dbt Labs, as more folks adopt dbt, we have started to see more and more use cases that push the boundaries of our established best practices. This is especially true for those adopting dbt in the enterprise space. After two years of helping companies with 20 to 10,000+ employees implement dbt & dbt Cloud, the below is my best attempt to answer the question: “Should I have one repository for my dbt project or many?”

SQL 52

Node.js Memory Error on Mac Using M1

Grouparoo

I was working with our fancy new CLI tool on my fancy new MacBook Pro with the M1 chip when I came across this scary error, courtesy of Node.js: "FATAL ERROR: wasm code commit Allocation failed - process out of memory". It began occurring regularly enough that I started digging. I've since come across two methods for solving this issue. Method #1: Upgrade to Node v15. I found this discussion, which noted that Node.js versions prior to v15 do not natively support the Apple M1 chip.

Coding 52

Growth Engineering at Netflix - Creating a Scalable Offers Platform

Netflix Tech

by Eric Eiswerth. Background: Netflix has been offering streaming video-on-demand (SVOD) for over 10 years. Throughout that time we’ve primarily relied on 3 plans (Basic, Standard, & Premium), combined with the 30-day free trial, to drive global customer acquisition. The world has changed a lot in this time. Competition for people’s leisure time has increased, the device ecosystem has grown phenomenally, and consumers want to watch premium content whenever they want, wherever they are, and on w

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases. The number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

Using COD and CML to build applications that predict stock data

Cloudera

No, not really. You probably won’t be rich unless you work really hard… As nice as it would be, you can’t really predict a stock price based solely on ML, but now I have your attention! Continuing from my previous blog post about how awesome and easy it is to develop web-based applications backed by Cloudera Operational Database (COD), I started a small project to integrate COD with another CDP cloud experience, Cloudera Machine Learning (CML).

What is Teradata Unity and Why Do You Need It?

Teradata

Learn more about Teradata Unity, a powerful portfolio for high availability and data synchronization in a Teradata-powered analytical ecosystem.

IT 52

Data Observability: How to Build Your Own Data Anomaly Detectors Using SQL

Monte Carlo

In this article series, we walk through how you can create your own data observability monitors and data anomaly detectors from scratch, mapping to five key pillars of data health. Part I can be found here. Part II of this series was adapted from Barr Moses and Ryan Kearns’ O’Reilly training, Managing Data Downtime: Applying Observability to Your Data Pipelines, the industry’s first-ever course on data observability.

SQL 45

Why Artificial Intelligence May Not Offer the Business Value You Think

DataKitchen

The post Why Artificial Intelligence May Not Offer the Business Value You Think first appeared on DataKitchen.

52

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

A Cost-Effective Data Warehouse Solution in CDP Public Cloud – Part1

Cloudera

Today’s customers have a growing need for faster end-to-end data ingestion to meet the expected speed of insights and overall business demand. This ‘need for speed’ drives a rethink on building a more modern data warehouse solution, one that balances speed with platform cost management, performance, and reliability. A typical approach that we have seen in customers’ environments is that ETL applications pull data with a frequency of minutes and land it into HDFS storage as an extra Hive table

7-Step Guide to Become a Machine Learning Engineer in 2023

ProjectPro

Spoiler Alert: Becoming a machine learning engineer can sound like a hard-to-reach goal, but let us tell you the truth: it isn’t as hard as it seems. And yes, we’re talking to you, the person who’s reading this because they’re probably wondering what a machine learning engineer is, what a machine learning engineer does, how to become a machine learning engineer, and, more importantly, whether they can pull it off.

Find out what challenges Customer Conversion solves at Zalando

Zalando Engineering

When our Hiring Sprint kicks off next month, we will be looking for great professionals to join some of our stellar teams: Shopping Cart, Checkout, Sales Orders and Returns. All meaningful segments of our Customer Conversion organization, these teams are responsible for forging and shaping some of the most relevant experiences in the Zalando customer journey.

Retail 40

Indexing Amazon S3 for Real-Time Analytics on Data Lakes

Rockset

Amazon Simple Storage Service (Amazon S3) is one of the leading cloud object storage services available. It uses an HTTP interface, making it easy for application developers to integrate S3 into their applications. Athena is a serverless query service provided by Amazon to query the data stored in Amazon S3 using standard SQL. Because it integrates easily with S3, is serverless, and uses a familiar language, Athena has become the default service for most business intelligence (BI) decision maker

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

#ClouderaLife Spotlight: Valaretha Brown, Sr. Partner Marketing Manager, ISV

Cloudera

Valaretha Brown (also known as Val) is Cloudera’s Sr. Partner Marketing Manager leading the strategy behind the go-to-market plans with our Independent Software Vendors. When she was young, she was always curious about corporate America. “My immediate family members received vocational school certificates and were hard working, blue collar workers.” This, along with her first job in fast food, helped her realize, “using my mind more than my hands to earn a living was right up my alley.”

Better Understand Your Geospatial Data - PostGIS GeoJSON

Preset

Apache Superset™ can visualize your geodata stored in Postgres | PostGIS GeoJSON

Data 52

Differentiation Through DataOps in Financial Services

DataKitchen

The post Differentiation Through DataOps in Financial Services first appeared on DataKitchen.

52

The agile manifesto: 20 years later

François Nguyen

Or Robert C. Martin, the uncle you should visit more often. Where was I 20 years ago, when these 17 brilliant folks were at a ski resort for the Agile Manifesto? I was part of a small team of great individuals; in fact, we were an alternative to an IT department unable to deliver what we wanted, so we decided to do it ourselves. Without knowing it, we were totally in that agile mindset: valuing interactions, working software, our collaborations with the users and be able to change be

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.