Kafka Summit Americas 2021 Recap
Confluent
SEPTEMBER 16, 2021
The full inventory of three online Kafka Summits in 2021 is now complete. Kafka Summit Americas wrapped just yesterday. Being a part of the event team and the Program Committee, […].
Start Data Engineering
SEPTEMBER 16, 2021
Contents: 1. Introduction; 2. What is scaling & why do we need it?; 3. Types of scaling; 4. Choose your scaling strategy; 5. Conclusion; 6. Further reading; 7. References. Introduction: Choosing tools/frameworks to scale your data pipelines can be confusing. If you have struggled with data pipelines that randomly crash, or with finding guides on how to scale your data pipelines from the ground up, then this post is for you.
Data Engineering Podcast
SEPTEMBER 12, 2021
Summary The Cassandra database is one of the first open source options for globally scalable storage systems. Since its introduction in 2008 it has been powering systems at every scale. The community recently released a new major version that marks a milestone in its maturity and stability as a project and database. In this episode Ben Bromhead, CTO of Instaclustr, shares the challenges that the community has worked through, the work that went into the release, and how the stability and testing
Teradata
SEPTEMBER 12, 2021
Groupon is modernizing with Vantage on AWS to better match its data & analytics with demands of its global business. The Cloud allows Groupon to better leverage infrastructure dollars, support more technology projects and capture opportunity.
Speaker: Tamara Fingerlin, Developer Advocate
Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.
Confluent
SEPTEMBER 15, 2021
GraphQL and Apache Kafka® are sometimes troubled with misconceptions. One of the reasons for this is that people are often familiar with one but not the other. GraphQL is mostly […].
DataKitchen
SEPTEMBER 14, 2021
The post What Should Enterprises Do to Offset Future Technology Disruption? first appeared on DataKitchen.
Data Engineering Digest brings together the best content for data engineering professionals from the widest variety of industry thought leaders.
Cloudera
SEPTEMBER 14, 2021
In June, we announced the beginning of a new chapter for Cloudera, with a mission to make data and analytics easy and accessible, for everyone. With transformation comes change, and today I’m thrilled to announce the promotion of Sudhir “Suds” Menon, Ram Venkatesh and Paul Codding, three leaders driving our mission forward. The foundation of our mission is a move to a hybrid data cloud platform, an evolution of our Cloudera Data Platform, a hybrid and multi-cloud solution purpose-built with th
Confluent
SEPTEMBER 14, 2021
Data governance initiatives aim to manage the availability, integrity, and security of data used across an organization. With the explosion in volume, variety, and velocity of data powering the modern […].
DataKitchen
SEPTEMBER 17, 2021
Below is our final post (5 of 5) on combining data mesh with DataOps to foster innovation while addressing the challenges of a data mesh decentralized architecture. We see a DataOps process hub like the DataKitchen Platform playing a central supporting role in successfully implementing a data mesh. DataOps excels at the type of workflow automation that can coordinate interdependent domains, manage order-of-operations issues and handle inter-domain communication.
Netflix Tech
SEPTEMBER 13, 2021
Written by Jose Fernandez, Arthur Gonigberg, Julia Knecht, and Patrick Thomas. In 2017, Netflix Studios was hitting an inflection point from a period of merely rapid growth to the sort of explosive growth that throws “how do we scale?” into every conversation. The vision was to create a “Studio in the Cloud”, with applications supporting every part of the business from pitch to play.
Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage
There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.
Datakin
SEPTEMBER 14, 2021
Building a Remote-First Culture. Written by Amanda Bulger on Sep 15, 2021. This morning I was planning an offsite for our team – our first one since Datakin was founded during the pandemic – and I had a realization: I haven’t met most of these people in person yet! We’ve been working together for months and months, solving interesting problems and planning social events, but we have been restricted to knowing each other through a tiny box on a screen.
ProjectPro
SEPTEMBER 16, 2021
If you are looking to land a job as a data analyst or a data scientist, SQL is a must-have skill on your resume. Everyone uses SQL to query data and perform analysis, from the biggest names in tech like Amazon, Netflix, and Google to fast-growing seed-stage startups in data. Before the world was taken over by the buzz of data science and analytics, data management still existed.
Rockset
SEPTEMBER 14, 2021
We’re introducing a new Rockset Integration for Apache Kafka that offers native support for Confluent Cloud and Apache Kafka, making it simpler and faster to ingest streaming data for real-time analytics. This new integration comes on the heels of several new product features that make Rockset more affordable and accessible for real-time analytics including SQL-based rollups and transformations.
dbt Developer Hub
SEPTEMBER 14, 2021
Hello there, Do you remember? The 21st day of September? 'Course you do, it was two days ago. Well that's a win in your bucket and the day's barely begun! So let's get a win for someone else -- like Jeremy Cohen, the dbt Core product manager. I'm sure you know that half of the updates in this email are pushed automatically when we upgrade everyone to the latest version of dbt Cloud
Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives
Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri
Teradata
SEPTEMBER 14, 2021
Learn what Mitsui Sumitomo Insurance, one of Japan’s leading insurance and finance groups, has achieved through leveraging the power of data analytics.
ProjectPro
SEPTEMBER 16, 2021
NoSQL databases are the new-age solutions to distributed unstructured data storage and processing. The speed, scalability, and fail-over safety offered by NoSQL databases are needed in the current times in the wake of Big Data Analytics and Data Science technologies. The edge that NoSQL databases have over their SQL counterparts is high scalability and faster read/write performance, both highly appreciated features in distributed systems.
Preset
SEPTEMBER 13, 2021
Here we go step by step to build an open-source ingestion layer with Airbyte.
Datakin
SEPTEMBER 13, 2021
Datakin in 104 seconds. Written by Ross Turk on Sep 13, 2021. Hi! I’m Ross from Datakin. I’d like to show you a new approach to keeping your pipelines running smoothly. Datakin observes your jobs as they run, collecting metadata that helps you understand how data flows through your ecosystem. We believe that lineage provides the context required to keep troubleshooting and resolution times low.
Zalando Engineering
SEPTEMBER 12, 2021
2016 - First attempt at rolling out SRE. Welcome to the first installment of our three-part series following Zalando’s SRE journey. Be sure to come back for the other two, with the next one being published in a week. Site Reliability Engineering (SRE) is a recent discipline in the Software Engineering field that is growing in popularity, with many companies turning to this new way of working to solve their operational issues, or to support their growing scale.
ProjectPro
SEPTEMBER 16, 2021
"87% of Data Science Projects never make it to production" - VentureBeat. According to the analytics firm Cognilytica, the MLOps market is anticipated to be worth $4 billion by the end of 2025. Jobs over the next decade will be built on top of Data Science, but for production. Data Science has flourished over the decade on the promise that organizations will leverage analytics for profitable business decision-making.
Teradata
SEPTEMBER 12, 2021
The new Vantage BYOM feature allows data scientists and data engineers to finally operationalize all their predictive models. Find out more.
RudderStack
SEPTEMBER 16, 2021
Learn how Touchless used a data-first design approach and leveraged RudderStack to help Wavedirect 4X Leads with dynamically generated SEO landing pages.
Speaker: Tamara Fingerlin, Developer Advocate
In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!
Cloudera
SEPTEMBER 15, 2021
Leveraging the Internet of Things (IoT) allows you to improve processes and take your business in new directions. But it requires you to live on the edge. That’s where you find the ability to empower IoT devices to respond to events in real time by capturing and analyzing the relevant data. Edge computing relies on squeezing the power and functionality of a data center into a micro site as close to data sources as possible to enable real-time tasks.
ProjectPro
SEPTEMBER 14, 2021
Cloud technology is widely used, with 94% of enterprises already using one or multiple cloud services. In a couple of years, the public cloud market will reach $623.3 billion. There are abundant options available in the cloud technology market, with AWS and OpenStack as the two trendy choices.
ProjectPro
SEPTEMBER 14, 2021
As a data science aspirant, you would have probably come across the following phrase more than once: “A data scientist is a person who is better at statistics than any programmer and better at programming than any statistician.” Before data science became a well-known career path, companies would hire statisticians to process their data and develop insights based on trends observed.
RudderStack
SEPTEMBER 14, 2021
This guide covers how to send data from your website to customer.io in less than five minutes.
Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage
When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m
ProjectPro
SEPTEMBER 11, 2021
This blog contains a list of business analyst interview questions and answers. You will find it helpful if you are a hiring manager looking for business analyst questions to ask during an interview, or a job seeker interested in business analyst jobs.
Cloudera
SEPTEMBER 13, 2021
About Cruise Control. There are two big gaps in the Apache Kafka project when we think of operating a cluster. The first is monitoring the cluster efficiently and the second is managing failures and changes in the cluster. There are no solutions for these inside the Kafka project, but there are many good third-party tools for both problems. Cruise Control is one of the earliest open source tools to provide a solution for the failure management problem and, lately, for the monitoring problem as well.