Sat. Aug 29, 2020 - Fri. Sep 04, 2020

Creating a Serverless Environment for Testing Your Apache Kafka Applications

Confluent

If you are taking your first steps with Apache Kafka®, looking at a test environment for your client application, or building a Kafka demo, there are two “easy button” paths […].

Kafka 133

Where to start if you want to become a Data Engineer

Team Data Science

"Where can I start if I want to become a Data Engineer?" This is a question I have heard many times before. My answer is always the same: start doing a Data Engineering project! Choose a tool: your first step should be to select a tool. Then get some data and build the whole project up around that tool.

Streaming Analytics in the Real World

Cloudera

From leading banks and insurance organizations to some of the largest telcos, manufacturers, retailers, and healthcare and pharma companies, organizations across diverse verticals lead the way with real-time data and streaming analytics. These businesses use data-fueled insights to enhance the customer experience, reduce costs, and increase revenues. And Cloudera is at the heart of enabling these real-time, data-driven transformations.

Insurance 101

Building A Better Data Warehouse For The Cloud At Firebolt

Data Engineering Podcast

Summary Data warehouse technology has been around for decades and has gone through several generational shifts in that time. The current trends in data warehousing are oriented around cloud native architectures that take advantage of dynamic scaling and the separation of compute and storage. Firebolt is taking that a step further with a core focus on speed and interactivity.

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide, with best practices and examples, to debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate

The Power of Data and Analytic Processing Gravity

Teradata

Taking the definition of physical gravity & extending it to data analytics, we explore the opportunities to combine data gravity with analytic processing, at scale with Vantage.

Process 93

Important countries and regions with Data Science demand

Team Data Science

In which regions or countries is the field of Data Science booming, and where is there a large number of jobs as a result? This is a very interesting question, which newcomers and graduates often ask themselves. Maybe you have already asked yourself this question? The USA as an advanced country: companies in the USA are obviously very advanced with Data Science.

More Trending

Enabling the Deployment of Event-Driven Architectures Everywhere Using Microsoft Azure and Confluent Cloud

Confluent

Hybrid cloud architecture and accelerated cloud migrations are becoming the norm rather than the exception, as our increasingly digital world introduces certain challenges along the way, including modernizing existing application/architecture, […].

Key Challenges with Quasi Experiments at Netflix

Netflix Tech

By Kamer Toker-Yildiz, Colin McFarland, and Julia Glick. At Netflix, when we can’t run A/B experiments we run quasi experiments! We run quasi experiments with various objectives such as non-member experiments focusing on acquisition, member experiments focusing on member engagement, or video streaming experiments focusing on content delivery. Consolidating on one methodology could be a challenge, as we may face different design or data constraints or optimization goals.

Media 75

Why you should not learn everything in Data Science

Team Data Science

"Since I started exploring Data Engineering, it has been overwhelming. In the end I have the feeling of giving up." This is a message that reached me from a viewer on YouTube. And that's exactly how I feel sometimes! Sometimes I feel a bit overwhelmed by the whole thing. Because there is so much going on. All the technology and Data Science hype. There is always something new on the horizon.

The Advantages Of Live Data-Streaming In The Competitive Financial Services Sector (Part III)

Cloudera

Live data-streaming offers businesses exciting new opportunities to transform the way they operate, leveraging real-time insights to drive better decision making and enhance operational efficiency. To find out more about how streaming data might impact the financial services sector I sat down for a chat with Dinesh Chandrasekhar, Head of Product Marketing in Cloudera’s Data-in-Motion Business Unit.

Kafka 98

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Activating Intent: How Confluent is Energizing Its Diversity, Equity, and Inclusion Practice

Confluent

As a South African, who grew up during the era of apartheid, I’ve witnessed firsthand the negative and long-lasting impact of discrimination, bias, and inequity, and I have a strong […].

IT 64

HDMI - Scaling Netflix Certification

Netflix Tech

By Scott Bolter, Matthew Lehman, and Akshay Garg. At Netflix, we take the task of preserving the creative vision of our content all the way to a subscriber's TV screen very seriously. This significantly increases the scope of our application integration and certification processes for streaming devices like set-top boxes (STBs) and TVs.

Workflow for creating a Data Engineering project and how you can build one!

Team Data Science

You want to become a data engineer, but don't know how to set up a data engineering project? I will show you! First of all, do not make the mistake that, unfortunately, many people make: they want to build the whole thing from the beginning. They say: "Okay I need to do a project. I need to make a big thing. I don't even know what data and what tools I want to use.

The Future Of The Telco Industry And Impact Of 5G & IoT – Part 3

Cloudera

In the final installment in the series, Vijay Raja, Director of Industry & Solutions Marketing at Cloudera, shares his views on how the telecom sector is changing and where it goes next. Hi Vijay, thank you so much for joining us again. To continue where we left off, how are ML and IoT influencing the telecom sector, and how is Cloudera supporting this industry evolution?

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Underscores are Overloaded in Scala!

Rock the JVM

Scala syntax can be confusing: discover almost all uses of underscores and why understanding their inconsistent philosophy is worthwhile

Scala 52

How real-time stream processing works with ksqlDB, in 7 animations

Confluent

ksqlDB, the event streaming database, is becoming one of the most popular ways to work with Apache Kafka®.

Process 52

Why You Need Data Engineers And Data Scientists To Be Successful!

Team Data Science

Data Science, Artificial Intelligence, and Machine Learning: these topics are currently the hype in the field of Data Science. Everyone wants to become a Data Scientist. But isn't the work being done in the field of Data Engineering the real MVP? Isn't it important to have Data Scientists AND Data Engineers on board to make a project successful? Yes, it is!

Discover and Explore Data Faster with the CDP DDE Template

Cloudera

From A to Z in 10 minutes! If you have previous experience with setting up, sizing, and deploying a distributed search engine service, it is hard to believe that this is possible. Imagine how many times IT has lost valuable time spending hours trying to understand Apache Solr application requirements and map them to the best way to size and deploy the Solr service.

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

How ksqlDB works in 7 animations

Confluent

ksqlDB, the event streaming database, is becoming one of the most popular ways to work with Apache Kafka.

Kafka 52

Helping traditional organizations being more efficient

DareData

The world of data science and information technology is a constantly evolving landscape, where dozens of new tools and methodologies are created and updated daily, and many others quickly become obsolete. Every organization has its own ecosystem of applications, but even the most advanced organizations sometimes fall behind in certain areas when compared to the bleeding edge of technological advances.

CDP Private Cloud is a Game-changer for Partners

Cloudera

Recently, Cloudera announced the release of Cloudera CDP Private Cloud, delivering the final component of our hybrid cloud strategy. There’s nothing comparable to it in the industry. CDP Private Cloud offers benefits of a public cloud architecture—autoscaling, isolation, agile provisioning, etc.—in an on-premise environment. Additionally, lines of business (LOBs) are able to gain access to a shared data lake that is secured and governed by the use of Cloudera Shared Data Experience (SDX).

Cloud 69

The Ultimate Guide to Apache Airflow DAGs

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; scale you

Back to school – CEOs need to learn a new language, fast!

Teradata

CEOs of banks know all about change. But the existential challenge posed by Big Tech requires a totally new set of skills. What do they need to learn to survive?

Banking 52

Preset & Superset User Documentation

Preset

Preset and Superset user documentation is available for everyone who wants to become a Superset expert.

52

Repartition vs Coalesce in Apache Spark

Rock the JVM

Clarifying the differences between two essential repartitioning operations in Apache Spark

52

Migration Supporting Real-Time Analytics for Customer Experience Management

Cloudera

Service Management Group (SMG) offers an easy-to-use experience management (XM) platform that combines end-to-end customer and employee experience management software with hands-on professional services to deliver actionable insights and help brands get smarter about their customers. The XM platform, smg360, helps customers across verticals, including restaurants, retail, and healthcare, drive changes that boost loyalty and improve business outcomes.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Teradata Dynamic Resource Optimization – Both On-Premises and in the Cloud

Teradata

With Teradata Vantage on Azure, customers have access to the same dynamic resource optimization tools that they have come to love with the added agility that Azure brings to the table.

Cloud 52

Offload Real-Time Reporting and Analytics from MongoDB Using PostgreSQL

Rockset

MongoDB’s Advantages & Disadvantages: MongoDB has comprehensive aggregation capabilities. You can run many analytic queries on MongoDB without exporting your data to a third-party tool. However, these aggregation queries are frequently CPU-intensive and can block or delay the execution of other queries. For example, Online Transactional Processing (OLTP) queries are usually short read operations that have direct impacts on the user experience.

MongoDB 40

Edgar: Solving Mysteries Faster with Observability

Netflix Tech

Edgar helps Netflix teams troubleshoot distributed systems efficiently with the help of a summarized presentation of request tracing, logs, analysis, and metadata. By Elizabeth Carretto. Everyone loves Unsolved Mysteries. There’s always someone who seems like the surefire culprit. There’s a clear motive, the perfect opportunity, and an incriminating footprint left behind.

Metadata 119

Larry H Miller Sports & Entertainment

Teradata

The Utah Jazz create winning customer experiences using Teradata Vantage on AWS with consumption pricing for flexible and elastic modern cloud analytics.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m