To effectively use ksqlDB, the streaming database for Apache Kafka®, you should of course be familiar with its features and syntax. However, a deeper understanding of what goes on underneath […].
Humans have been trying to make machines chat for decades. Alan Turing considered computers’ ability to generate natural speech a proof of their ability to think. Today, we converse with virtual companions all the time. But despite years of research and innovation, their unnatural responses remind us that no, we’re not yet at the HAL 9000-level of speech sophistication.
Apache Ozone is a scalable distributed object store that can efficiently manage billions of small and large files. Ozone natively provides Amazon S3 and Hadoop Filesystem compatible endpoints in addition to its own native object store API endpoint and is designed to work seamlessly with enterprise scale data warehousing, machine learning and streaming workloads.
Summary: So much time and energy is spent on data integration because of how our applications are designed. By making the software the owner of the data that it generates, we have to go through the trouble of extracting the information before it can be used elsewhere. The team at Cinchy is working to bring about a new paradigm of software architecture that puts the data as the central element.
In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples to debugging Airflow DAGs. You’ll learn how to: Create a standardized process for debugging to quickly diagnose errors in your DAGs Identify common issues with DAGs, tasks, and connections Distinguish between Airflow-relate
When companies need help with their vehicle fleets—including transport, storage, or renewing expired registrations—they don’t want to have to deal with multiple vehicle logistics providers. For these companies, ACERTUS provides […].
DataOps has become an essential methodology in pharmaceutical enterprise data organizations, especially for commercial operations. Companies that implement it well derive significant competitive advantage from their superior ability to manage and create value from data. They will be able to produce high-quality, on-demand insight that consistently leads to successful business decisions.
Cloudera recently appointed a Cloud Director for Asia Pacific (APAC), Stevie Walsh, to help drive our hybrid and multi-cloud offerings in the region, supporting our customers in accelerating their digital transformation journey. We’ve asked her to share her cloud vision for Cloudera in APAC and the exciting plans that she has in her new position. What drew you to work in the cloud space?
Summary: The technological and social ecosystem of data engineering and data management has been reaching a stage of maturity recently. As part of this stage in our collective journey, the focus has been shifting toward operation and automation of the infrastructure and workflows that power our analytical workloads. It is an encouraging sign for the industry, but it is still a complex and challenging undertaking.
Sometimes you’d like to write your own code for producing data to an Apache Kafka® topic and connecting to a Kafka cluster programmatically. Confluent provides client libraries for several different […].
As Back to School promotions hit the shelves, Christmas & New Year offers are already locked in. Are these long-lead cycles still effective in today’s dynamic Retail & CPG environment?
There is an urgent need for banks to be nimble and adaptable in the thick of a multitude of industry challenges, ranging from the maze of regulatory compliance and sophisticated criminal activities to rising customer expectations and competition from traditional banks and new digital entrants. As banks find their bearings in this landscape, what appear to be insurmountable odds are in fact opportunities for growth and competitive differentiation.
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
This blog introduces the critical differences between logistic regression and linear regression. First, we introduce the two machine learning algorithms in detail, and then we move on to their practical applications to answer questions such as when to use linear regression vs logistic regression.
The world is moving from batch to real-time. With Confluent’s recent IPO, streaming data has officially gone mainstream, “becoming the underpinning of a modern digital customer experience, and the key to driving intelligent, efficient operations” to quote from their letter to shareholders. But while it’s easier to stream the data, analyzing it in real time still involves too much cost and complexity.
As 5G puts data analytics at the heart of the next wave of sustainable growth, telcos must ensure their existing investments in data infrastructure can be leveraged to enable that growth.
Meet Barnabas Maidics. Barnabas is a three-year Clouderan working as a Software Engineer in Hungary. Having started his journey at Cloudera as an intern and then making his way to the Data In Motion team, Barnabas feels his first experience in the real world of work has allowed him to grow, not only professionally but on a personal level as well. He’s always known this was the career path for him.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
Most of us have observed that data scientist is usually labeled the hottest job of the 21st century, but is it the only desirable job? For beginners or peeps who are utterly new to the data industry, Data Scientist is likely to be the first job title they come across, and the perks of being one usually make them go crazy. Within no time, most of them are either data scientists already or have set a clear goal to become one.
The State of Customer Data: The Modern Data Stack is all about making powerful marketing and sales decisions and performing impactful business analytics from a single source of truth. Customer Data Integration makes this possible. Customers expect personalized experiences, connection, and relevancy. However, the fact of the matter is that without accurate, up-to-date data in a centralized location, your marketing team is missing out on opportunities.
In this edition of Partner Perspective, Cloudera’s own Rachel Tuller sits down with Craig Smith, Vice President of Data, AI and IoT at Tech Data. They discuss the importance of business partnerships, the pandemic’s impact on the tech industry, and Craig’s predictions about the industry going forward. Tech Data is one of the largest technology distributors globally.
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
Learn more about Cloud Snapshots, how they compare to traditional backups and how they can be deployed in your architecture to maximize data protection.
When it came to achieving data trust at Vimeo, Lior Solomon, VP of Engineering, Data, and his team were faced with an important choice: build or buy their data observability platform. After trying various solutions, they chose to partner with Monte Carlo, a decision that allowed them to “literally jump into the future” with the platform’s automatic detection and end-to-end visibility into their Looker and Snowflake pipelines in minutes, not days.
Why Apache Superset™, an open source data visualization and BI platform, is the most compelling alternative to Looker, a closed-source BI platform by Google.
COVID-19 vaccines were developed in record time. One of the main reasons for the accelerated development was the quick exchange of data between academia, healthcare institutions, government agencies, and nonprofit entities. “COVID research is a great example of where sharing data and having large quantities of data to analyze would be beneficial to us all,” said Renee Dvir, solutions engineering manager at Cloudera.
With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: Understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to Write DAGs that adapt to your data at runtime and set up alerts and notifications Scale you
Consider the following data table of daily consumption by age group and its associated graph:

Age   | Dairy | Staple Food | High-Calorie Food | Supplements
0-10  | 50    | 30          | 10                | 10
11-30 | 35    | 45          | 15                | 5
31-50 | 25    | 55          | 13                | 7
51-80 | 40    | 40          | 4                 | 16

Even if you’ve just skipped over the figures, you’d agree that the graph is at the very least a tad more memorable and appealing than data tables or text.
With submissions for the Data Impact Awards coming in, we’re revisiting last year’s winners to find out what set them apart. In 2020, Telkomsel took home the gold in the Industry Transformation category. The company stood out to the judges for taking its business to the next level by disrupting the telecommunications industry through the application of new technologies, skills, and operational processes.