Summary Building an ETL pipeline is a common need across businesses and industries. It’s easy to get one started but difficult to manage as new requirements are added and greater scalability becomes necessary. Rather than duplicating the efforts of other engineers it might be best to use a hosted service to handle the plumbing so that you can focus on the parts that actually matter for your business.
At Uber, we combine real-time systems monitoring with intelligent alerting mechanisms to ensure the availability and reliability of our apps. In our push to empower our engineers to author more accurate alerts, Uber’s Observability Applications team sought to introduce alert … The post Engineering a Job-based Forecasting Workflow for Observability Anomaly Detection appeared first on Uber Engineering Blog.
Comparing Redux, MobX & setState in React by Kaiser Anwar Shad and revised by Eugen Kiss Introduction React is a declarative, efficient, and flexible JavaScript library for building user interfaces. Compared to other frontend libraries and frameworks, React’s core concept is simple but powerful: ‘React makes it painless to design simple views and renders by using virtual DOM’.
Digital Transformation Definition Digital transformation has been a big topic for a few years now, and it has many definitions. From a business perspective, digital transformation is about leveraging digital technologies to improve processes, competencies, and business models. It is also about changing the culture of the company because it requires letting go of old.
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
A 2017 IDC White Paper “recommend[s] that organizations that want to get the most out of cloud should train a wide range of stakeholders on cloud fundamentals and provide deep training to key technical teams” (emphasis ours). Regular readers of the Cloud Academy blog know we’ve been talking about this for a long time. Future-proofing your organization requires technical excellence, collective experience, business context, and shared understanding.
Global Technology Partner of the Year. A Cloudera contingent, including Business Development, Marketing, Sales, and Software Engineering just got back from Qlik Qonnections in Orlando, Florida where the vibe to #DisruptEverything was strong! Qonnections is the premier Business Intelligence (BI) conference that brings data lovers together to share insights and discover data in new ways.
Summary Most businesses end up with data in a myriad of places with varying levels of structure. This makes it difficult to gain insights from across departments, projects, or people. Presto is a distributed SQL engine that allows you to tie all of your information together without having to first aggregate it all into a data warehouse. Kamil Bajda-Pawlikowski co-founded Starburst Data to provide support and tooling for Presto, as well as contributing advanced features back to the project.
News on Hadoop - April 2018 Big Data and Cambridge Analytica: 5 Big Picture Truths. Datamation.com, April 2, 2018. Cambridge Analytica is in the headlines as people realize how the Trump campaign and its donors used FB ads to create a sociopolitical shakeup. Let’s look at what big truths the #deletefacebook movement reveals and what implications it has - i) This data leak is not due to any breach but is quite common and nothing unusual.
Our cooperation with CODE University At Zalando, we strive to create an environment in which all our engineers, product, and design specialists feel they can inspire each other, make their ideas a reality, and contribute to providing the best possible platform for Zalando’s customers to have the ultimate customer experience. Part of this is making sure we understand what the future generation of product managers, interaction designers, and software engineers are thinking and what ideas and innov
Summary The Open Data Science Conference brings together a variety of data professionals each year in Boston. This week’s episode consists of a pair of brief interviews conducted on-site at the conference. First up you’ll hear from Andy Eschbacher of Carto. He describes some of the complexities inherent to working with geospatial data, how they are handling it, and some of the interesting use cases that they enable for their customers.
Summary The Open Data Science Conference brings together a variety of data professionals each year in Boston. This week’s episode consists of a pair of brief interviews conducted on-site at the conference. First up you’ll hear from Alan Anders, the CTO of Applecart about their challenges with getting Spark to scale for constructing an entity graph from multiple data sources.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
Over the past decade, we have observed open source powered big data and analytics platforms evolve from large data storage containers to massively scalable advanced modeling platforms that seamlessly operate on-premises and in a multi-cloud environment. Cloudera Enterprise is the market leader in this space and works closely with the wider open source community to integrate the latest innovations in machine learning and artificial intelligence (AI).
The Keystone to a Holistic Trade Surveillance Strategy. Criminals don’t refer to a playbook of best practices to execute a crime. They are creative in their thinking and collaborative in their efforts (including with parties who may not know they are complicit) to obtain their objectives and avoid getting caught. Trade surveillance in the financial services industry must respond to their creativity and the complexities of a global trading network, the myriad of digital and physical communication
Cloudera Data Science Workbench (CDSW) makes secure, collaborative data science at scale a reality for the enterprise and accelerates the delivery of new data products. With CDSW, organizations can research and experiment faster, deploy models easily and with confidence, as well as rely on the wider Cloudera platform to reduce the risks and costs of data science projects.
Cloudera 2017 Data Impact Award Winners. We are excited to kick off the 2018 Data Impact Awards! Nominations are now open and will remain open until July 20th. Since 2012, the Data Impact Awards have showcased how organizations are using Cloudera and the power of data to transform themselves and achieve dramatic results. Click to tweet: Nominations are now open for the sixth annual Cloudera Data Impact Awards!
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
Why Enterprises Need to Unify ML, Analytics, and Cloud. Times are changing, and the traditional models of analytics and data management don’t serve the needs of the modern enterprise, so the way to address these topics is changing too. While organizations are moving more workloads to the cloud, many mission-critical workloads remain on-prem. End users, data stewards, governance groups, and security groups alike can easily get overwhelmed with multiple access points, inconsistent user interfaces,
It was exactly one year ago at Strata London that we introduced the world to Cloudera Altus Data Engineering. The premise was simple: make it quicker and easier for customers to drive data to their machine learning and analytics services by leveraging cloud resources, while at the same time, eliminating the pain associated with managing datacenter or cloud infrastructure.
Sharing successful large scale agile experiences Zalando has been known for radical approaches to agility since 2015. In order to keep growing and stay successful, we took the next step in 2017, forming around 30 business units. Each business unit is formed around one particular business problem, topic or product with end2end responsibility. All disciplines needed are inside this business unit, from commercial roles to tech teams.
How to establish and evolve your data science community At Zalando, we have many guilds: self-organized groups of people who share interests. The topics, scope, size, and ways to organize the guilds vary. We have technical guilds like the web or API guilds, local and artistic guilds like the knitting guild in Helsinki, and some guilds that support the growth of people in certain job families, like the Data Science Guild.
With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: Understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to Write DAGs that adapt to your data at runtime and set up alerts and notifications Scale you
Moving from a more traditional internal IT setup to a product-driven culture I love building enterprise systems, because you get to work with your customers/users every day and literally see their lives change as you release new features. In my case, at Zalando, these are systems for fashion buying, supply chain management, inventory management and procure-to-pay processes (e.g. paying our suppliers for merchandise we bought from them).
Real-time joins in event-driven microservices As discussed in my previous blog post, Kafka is one of the key components of our event-driven microservice architecture in Zalando’s Smart Product Platform. We use it for sequencing events and building an aggregated view of data hierarchies. This post expands on what I previously wrote about the one-to-many data model and introduces more complex many-to-many relationships.
Celebrating Zalando’s first international tech hub Three years ago, Zalando decided to start looking beyond Germany’s borders to tap into Europe’s pool of tech talent. Diverse and brilliant minds from other European cities and beyond contributed to cementing Zalando’s place as Europe’s most fashionable tech company. So, back in 2015, Zalando’s first move was across the Irish Sea, and now the team is very excited to celebrate its third anniversary!