As described in the blog post Apache Kafka® Needs No Keeper: Removing the Apache ZooKeeper Dependency, when KIP-500 lands next year, Apache Kafka will replace its usage of Apache ZooKeeper […].
Growing up, were you ever told you can’t have it all? That you can’t eat all the snacks in one sitting? That you can’t watch the complete Back to the Future trilogy as well as study for your science exam in one evening? Over time, we learn to set priorities, make a decision for one thing over the other, and compromise. Just like when it comes to data access in business.
Netflix Android and iOS Studio Apps, now powered by Kotlin Multiplatform. By David Henry & Mel Yahya. Over the last few years Netflix has been developing a mobile app called Prodicle to innovate in the physical production of TV shows and movies. The world of physical production is fast-paced, and needs vary significantly by country and region, and even from one production to the next.
Summary: One of the most challenging aspects of building a data platform has nothing to do with pipelines and transformations. If you are putting your workflows into production, then you need to consider how you are going to implement data security, including access controls and auditing. Different databases and storage systems all have their own method of restricting access, and they are not all compatible with each other.
In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This comprehensive guide covers best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate
A fundamental challenge with today’s “data explosion” is finding the best answer to the question, “So where do I put my data?” while avoiding the longer-term problem of data warehouses, […].
With the massive explosion of data across the enterprise — both structured and unstructured from existing sources and new innovations such as streaming and IoT — businesses have needed to find creative ways of managing their increasingly complex data lifecycle to speed time to insight. At Cloudera, we set out to directly address these lifecycle challenges through the Cloudera Data Platform (CDP) — the only hybrid-cloud, multi-cloud enterprise data platform built for the full data lifecycle.
By Tianlong Chen and Ioannis Papapanagiotou. Netflix has more than 195 million subscribers that generate petabytes of data every day. Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy. Usually data scientists and engineers write Extract-Transform-Load (ETL) jobs and pipelines using big data compute technologies, like Spark or Presto, to process this data and perio
Our new global survey reveals how business leaders are changing the way they think about data, from their trust in it to the role it plays in a post-pandemic recovery.
If you want to enable your organization to leverage the full value of event-driven architectures, it is not enough to just integrate Apache Kafka® and wait for people to join […].
Being a woman in tech can be incredibly rewarding, lonely, frustrating and inspiring all at once. Each individual has their own experience and path that they’ve followed to get where they are. That is, after all, what makes us unique. Earlier this week Cindy Maike, VP Industry Solutions, hosted a panel discussion with women across the Cloudera EMEA business, working in a variety of different roles; each of us with diverse backgrounds and perspectives, which made for a wide-ranging discussion.
Part of our series on who works in Analytics at Netflix, and what the role entails. By Rocio Ruelas. Back when we were all working in offices, my favorite days were Monday, Wednesday, and Friday. Those were the days with the best hot breakfast, and I’ve always been a sucker for free food. I started the day by arriving at the LA office right before 8am and finding a parking spot close to the entrance.
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
When Brian, Evan, and I first talked about starting a company, we already had some ideas in mind about what we might want to do differently from our past roles. The three of us had all worked together before at TaskRabbit, but since we were starting a brand new company, we decided to approach how we would work from first principles. I thought we’d share some tidbits about how we work right now.
Cloudera and Dell/EMC are continuing our long and successful partnership of developing shared storage solutions for analytic workloads running in hybrid cloud. Customer demand has always been the key driver of roadmap features on our platforms. Since the inception of Cloudera Data Platform (CDP), Dell/EMC PowerScale and ECS have been highly requested solutions to be certified by Cloudera.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
COVID-19 has forced many businesses to rethink their business models due to changes in customer requirements, but it has also opened up a world of new opportunities.
I’m excited to soon be the newest member of Rockset. I will be joining a truly spectacular engineering team, working on a product that leverages deep technical insights to make real-time analytics easy. My passion is building infrastructure that makes things simpler for users, supporting people at higher levels of the stack by giving them clean APIs and predictable behavior.
With both the US election and Veterans Day right around the corner, veterans at Cloudera have been telling us their stories, including why they want us all to Make Time to Vote. The US has a long and evolving history with voting. Across generations, people have fought to gain and improve voting rights for all Americans, and in return, are honored for their sacrifices when those rights are exercised and their visions fulfilled.
The customer has never been more right. Across industries, customers have become conditioned to demand not only near-instant responses to their needs but that their needs be anticipated in advance. Financial institutions are not given a pass, despite a competitive landscape flooded with regulation and privacy considerations. The customer still has expectations for a personalized, timely, and relevant experience.
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
Cloudera service logs offer a breadth of information to assist in cluster maintenance, from security checks and auditing tasks to validation for performance tuning and testing, to name a few. However, log records generated by these services do not hold the same value for every organisation. For example, cyber teams may find more value in logs that outline user behaviour when accessing the data, whilst operational teams may prefer logs that show the spikes in load time throug
At Cloudera Fast Forward we work to make the recently possible useful. Our goal is to take the incredible data science and machine learning research developments we see emerging from academia and large industrial labs, and bridge the gap to products and processes that are useful to practitioners working across industries. In the past year, we’ve released research reports and prototypes exploring Deep Learning for Anomaly Detection, Causality for Machine Learning and NLP for Automated Question A
Cloudera’s annual Data Impact Awards will be announced during a virtual celebration on November 18, 2020. If you’d like to join us and hear more about the winners, you can register here. As always, we’re excited that the finalists represent a cross-section of industries. Personally, I’m thrilled to talk more about one of our healthcare finalists.
For too many retailers, brain & body have become separate, with data informing discrete projects & engagements but not used to transform entire business processes.
With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; Scale you
Today, Rockset announced $40M in Series B funding from Sequoia and Greylock, our two investors who have partnered with us right from the beginning. Additionally, we announced support for fully managed, secure private deployments of Rockset within a customer’s Amazon VPC. These are important milestones for both our company and product, but this announcement is less a celebration of Rockset than a recognition of our hundreds of beloved customers who have launched amazing real-time applications.
Many of us have become more conscious about how much activity we’re getting in a day, and it shows. Purchases of smartwatches that track calories and activities have dramatically increased since 2014. These smartwatches have helped people train for races, track different types of workouts, and be mindful of how much movement they are getting in a day.