Sat. Jun 04, 2022 - Fri. Jun 10, 2022

An In-Depth Data Mesh Discussion with Zhamak Dehghani

Jesse Anderson

In 2021 I had the pleasure of first getting to know and speak with Zhamak Dehghani, Director of Emerging Technologies at ThoughtWorks, in season one of the Data Dream Team series. Zhamak is a software engineer and architect who is (in)famously known as the founder of the data mesh concept, a paradigm shift in how we manage data-driven value at scale. I interviewed Zhamak last season as more of an introduction to Data Mesh.

Learn MLOps with This Free Course

KDnuggets

Learn to train and track your experiments, create ML pipelines, deploy models, monitor their performance in production, and adopt best practices from DevOps.

Bringing The Modern Data Stack To Everyone With Y42

Data Engineering Podcast

Summary: Cloud services have made highly scalable and performant data platforms economical and manageable for data teams. However, they are still challenging to work with and manage for anyone who isn’t in a technical role. Hung Dang understood the need to make data more accessible to the entire organization and created Y42 as a better user experience on top of the "modern data stack". In this episode he shares how he designed the platform to support the full spectrum of technical ex

The Future Is Hybrid Data, Embrace It

Cloudera

We live in a hybrid data world. In the past decade, the amount of structured data created, captured, copied, and consumed globally has grown from less than 1 ZB in 2011 to nearly 14 ZB in 2020. Impressive, but dwarfed by the amount of unstructured data, cloud data, and machine data – another 50 ZB. In fact, the total amount of data is expected to nearly triple by 2025.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

A Model Implementation

Teradata

How do you take the first steps to free the power of analytics from on-premise systems whilst protecting valuable data and de-risking transformation? Find out more.

NLP, NLU, and NLG: What’s The Difference? A Comprehensive Guide

KDnuggets

This article aims to quickly cover the similarities and differences between NLP, NLU, and NLG and talk about what the future for NLP holds.

More Trending

Cloudera’s Applied ML Prototype Catalog Continues to Grow

Cloudera

Here at Cloudera, we’re committed to helping make the lives of data practitioners as painless as possible. For data scientists, we continue to provide new Applied Machine Learning Prototypes (AMPs), which are open source and available on GitHub. These pre-built reference examples are complete end-to-end data science projects. In Cloudera Machine Learning (CML), you can deploy them with the single click of a button, bringing data scientists that much closer to providing value.

How to Elastically Scale Apache Kafka Clusters on Confluent Cloud

Confluent

How to elastically scale Kafka clusters from 0 to 100 MB/s and back with automatic cluster resizing, data rebalancing, real-time consumption optimization, and monitoring in seconds.

Python: The programming language of machine learning

KDnuggets

You can't avoid learning Python if you work on machine learning problems. You need to know what other people's code means and you need to convey your ideas to them too.

Scaling Appsec at Netflix (Part 2)

Netflix Tech

By Astha Singhal, Lakshmi Sudheer, Julia Knecht. The Application Security teams at Netflix are responsible for securing the software footprint that we create to run the Netflix product, the Netflix studio, and the business. Our customers are product and engineering teams at Netflix that build these software services and platforms. The Netflix cultural values of ‘Context not Control’ and ‘Freedom and Responsibility’ strongly influence how we do Security at Netflix.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

#ClouderaLife Spotlight: Hassan Mirza

Cloudera

In this #ClouderaLife Spotlight Hassan talks about three life themes that have kept him moving and motivated: learning from his father’s work ethic despite his family’s forcible displacement from their country of origin, his early experience with organized sports, and the value of mentorship. Hassan describes how these experiences led him to give back to his family and community by becoming a Mental Health First Aider and a mentor for refugees seeking a better life.

How Confluent Treats Incidents in the Cloud

Confluent

Fast infrastructure growth often comes with issues. Don't panic - learn from them! Here's how we analyze, monitor, and fix incidents at Confluent, and what we do to prevent risk.

A Structured Approach To Building a Machine Learning Model

KDnuggets

This article gives you a glimpse of how to approach a machine learning project with a clear outline of an easy-to-implement 5-step process.

Accelerate testing in Apache Airflow through DAG versioning

Zalando Engineering

Introduction: In the Performance Marketing department, we run paid advertisement campaigns for Zalando. To do so, we build services that allow us to manage campaigns, optimize and distribute content, and measure the performance of the campaigns at scale. Talking about measurement, one of the core systems we’ve built and continuously extended over the years is our so-called marketing ROI (return on investment) pipeline.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

Streaming Edge Data Collection and Global Data Distribution

Cloudera

In the first blog of the Universal Data Distribution blog series, we discussed the emerging need within enterprise organizations to take control of their data flows. From origin through all points of consumption both on-prem and in the cloud, all data flows need to be controlled in a simple, secure, universal, scalable, and cost-effective way. With the rapid increase of cloud services where data needs to be delivered (data lakes, lakehouses, cloud warehouses, cloud streaming systems, cloud busi

Data Engineering Annotated Monthly – May 2022

Big Data Tools

It’s the start of June. That means it’s time to start taking summer vacations and enjoying some fresh juice alongside your fresh news! Hi, I’m Pasha Finkelshteyn, and I’ll be your guide through this month’s news. I’ll offer my impressions of recent developments in the data engineering space and highlight new ideas from the wider community.

How is Data Mining Different from Machine Learning?

KDnuggets

Let's take a closer look at data mining and machine learning so we can tell where the two differ.

Apache Hop 2.0 released!

know.bi

The Apache Hop PMC and community released Apache Hop 2.0.0 late last week. This is the second major release of the platform and the first major release after Hop graduated as a Top-Level ASF Project.

Improving the Accuracy of Generative AI Systems: A Structured Approach

Speaker: Anindo Banerjea, CTO at Civio & Tony Karrer, CTO at Aggregage

When developing a Gen AI application, one of the most significant challenges is improving accuracy. This can be especially difficult when working with a large data corpus, and as the complexity of the task increases, the number of use cases/corner cases that the system is expected to handle essentially explodes. 💥 Anindo Banerjea is here to showcase his significant experience building AI/ML SaaS applications as he walks us through the current problems his company, Civio, is solving.

How Do We Transform and Model Data at Cloud Academy?

Cloud Academy

“Data is the new gold”: a common phrase over the last few years. For all organizations, data and information have become crucial to making good decisions for the future and having a clear understanding of how they’re making progress, or otherwise. At Cloud Academy, we strive to make data-informed decisions.

Top Posts May 30 – June 5: 21 Cheat Sheets for Data Science Interviews

KDnuggets

Also: Decision Tree Algorithm, Explained; How to Become a Machine Learning Engineer; The Complete Collection of Data Science Books – Part 2; 15 Python Coding Interview Questions You Must Know For Data Science.

MongoDB vs DynamoDB Head-to-Head: Which Should You Choose?

Rockset

Editor's note: We have updated this post to reflect comments and corrections we received from readers. We thank those who sent in comments for helping us make this post more accurate and useful. Databases are a key architectural component of many applications and services. Traditionally, organizations have chosen relational databases like SQL Server, Oracle, MySQL and Postgres.

The Ultimate Guide To Data-Driven Construction: Optimize Projects, Reduce Risks, & Boost Innovation

Speaker: Donna Laquidara-Carr, PhD, LEED AP, Industry Insights Research Director at Dodge Construction Network

In today’s construction market, owners, construction managers, and contractors must navigate increasing challenges, from cost management to project delays. Fortunately, digital tools now offer valuable insights to help mitigate these risks. However, the sheer volume of tools and the complexity of leveraging their data effectively can be daunting. That’s where data-driven construction comes in.

Building An External Data Product Is Different. Trust Me. (but read this anyway)

Monte Carlo

The data world moves unapologetically fast. It seems like just last year we started talking about how data teams were transitioning from providing a service, to treating data like a product or even building internal products across a decentralized data mesh architecture. Wait, that was *checks notes* January of this year?? Wow. Who knows, maybe Ferris Bueller became a data engineer.

Stateful Streams with Apache Pulsar and Apache Flink

Rock the JVM

Discover how to integrate Apache Pulsar with Apache Flink: perform advanced data enrichment using state from multiple topics

KDnuggets News, June 8: 21 Cheat Sheets for Data Science Interviews; Top 18 Data Science Group on LinkedIn

KDnuggets

21 Cheat Sheets for Data Science Interviews; Top 18 Data Science Group on LinkedIn; A Beginner's Guide to Q Learning; 3 Ways Understanding Bayes Theorem Will Improve Your Data Science; Machine Learning Is Not Like Your Brain Part 3: Fundamental Architecture.

Is the 4-Year Degree Obsolete?

Elder Research

Business Intelligence 101: How To Make The Best Solution Decision For Your Organization

Speaker: Evelyn Chou

Choosing the right business intelligence (BI) platform can feel like navigating a maze of features, promises, and technical jargon. With so many options available, how can you ensure you’re making the right decision for your organization’s unique needs? 🤔 This webinar brings together expert insights to break down the complexities of BI solution vetting.

Snowflake Observability and 4 Reasons Data Teams Should Invest In It

Monte Carlo

Adopting a cloud data warehouse like Snowflake is an important investment for any organization that wants to get the most value out of its data. Forrester’s Total Economic Impact of Snowflake report uncovered a customer ROI of 612% with total benefits of more than $21 million across three years. This immediate value is just scratching the surface.

Roadmap to Becoming a Successful Data Engineer

Rock the JVM

Discover key insights from one of Rock the JVM's standout students on building a successful career in Data Engineering

Understanding Functions for Data Science

KDnuggets

Most data science problems boil down to finding the mathematical function that describes the relationship between feature and target variables.

Must-haves on Your Data Science Resume

KDnuggets

Recruiters look at a resume for 7.4 seconds before making a decision on the candidate. That means you have less than 10 seconds to make a good impression. 10 seconds is not a lot of time, especially when you really want this job.

Driving Responsible Innovation: How to Navigate AI Governance & Data Privacy

Speaker: Aindra Misra, Senior Manager, Product Management (Data, ML, and Cloud Infrastructure) at BILL

Join us for an insightful webinar that explores the critical intersection of data privacy and AI governance. In today’s rapidly evolving tech landscape, building robust governance frameworks is essential to fostering innovation while staying compliant with regulations. Our expert speaker, Aindra Misra, will guide you through best practices for ensuring data protection while leveraging AI capabilities.