Sat.Jun 10, 2023 - Fri.Jun 16, 2023

Inside Agoda’s Private Cloud - Exclusive

The Pragmatic Engineer

👋 Hi, this is Gergely with the monthly, free issue of the Pragmatic Engineer Newsletter. In every issue, I cover challenges at Big Tech and startups through the lens of engineering managers and senior engineers. If you’re not a subscriber, you missed the issue on Shopify’s leveling split and a few others. Subscribe to get two full issues every week.

Cloud 251

The Journey of a Senior Data Scientist and Machine Learning Engineer at Spice Money

Analytics Vidhya

Introduction: Meet Tajinder, a seasoned Senior Data Scientist and ML Engineer who has excelled in the rapidly evolving field of data science. Tajinder’s passion for unraveling hidden patterns in complex datasets has driven impactful outcomes, transforming raw data into actionable intelligence. In this article, we explore Tajinder’s inspiring success story.

A Comprehensive Guide to Convolutional Neural Networks

KDnuggets

Artificial Intelligence has been witnessing monumental growth in bridging the gap between the capabilities of humans and machines. Researchers and enthusiasts alike work on numerous aspects of the field to make amazing things happen. One of many such areas is the domain of Computer Vision.

Migrating Netflix to GraphQL Safely

Netflix Tech

By Jennifer Shin, Tejas Shikhare, Will Emmanuel. In 2022, a major change was made to Netflix’s iOS and Android applications. We migrated Netflix’s mobile apps to GraphQL with zero downtime, which involved a total overhaul from the client to the API layer. Until recently, an internal API framework, Falcor, powered our mobile apps. They are now backed by Federated GraphQL, a distributed approach to APIs where domain teams can independently manage and own specific sections of the API.

Utilities 143

A Guide to Debugging Apache Airflow® DAGs

In Airflow, DAGs (your data pipelines) support nearly every use case. As these workflows grow in complexity and scale, efficiently identifying and resolving issues becomes a critical skill for every data engineer. This is a comprehensive guide with best practices and examples for debugging Airflow DAGs. You’ll learn how to: create a standardized process for debugging to quickly diagnose errors in your DAGs; identify common issues with DAGs, tasks, and connections; distinguish between Airflow-relate

An explosion in software engineers using AI coding tools?

The Pragmatic Engineer

GitHub surveyed 500 developers in the US for a sense of how they use AI coding tools. I examine the results and add context on how the survey was conducted.

The Journey of a Senior Data Scientist and Machine Learning Engineer in Fintech Domain

Analytics Vidhya

Introduction: Meet Tajinder, a seasoned Senior Data Scientist and ML Engineer who has excelled in the rapidly evolving field of data science. Tajinder’s passion for unraveling hidden patterns in complex datasets has driven impactful outcomes, transforming raw data into actionable intelligence. In this article, we explore Tajinder’s inspiring success story.

More Trending

Data News — Week 23.24

Christophe Blefari

The newsletter, a metaphor (credits). Hello, after the good weather comes the storm. I'm now under the Berlin rain at 20°. When I write in these conditions, I feel like a tortured author writing a depressing novel, while actually today I'll speak about the AI Act, Python, SQL and data platforms. A casual day at the office, finally. Some personal news: next Monday and Tuesday I'll be at Berlin Buzzwords; if you're there, ping me. It would be a pleasure to meet and hang out together.

What's new in Apache Spark 3.4.0 - Spark Connect

Waitingforcode

Spark Connect is probably the most anticipated feature in Apache Spark 3.4.0. It was announced in the Data+AI Summit 2022 keynotes and is getting a lot of coverage on social media right now. I'll try to add my small contribution by showing some implementation details.

Media 130

How to become a valuable data engineer

Start Data Engineering

Contents: 1. Introduction; 2. Skills; 2.1. Business Impact; 2.1.1. Know your business; 2.1.2. Money & Time; 2.2. Technical skills; 3. Build impactful projects; 4. Conclusion; 5. Further reading

1. Introduction: So you are a new data engineer (or looking for a DE job) and want to better yourself as a data engineer. However, when you look at job postings or company tech stacks, you are overwhelmed by the sheer number of tools you have to learn!

Your Ultimate Guide to Chat GPT and Other Abbreviations

KDnuggets

Everyone seems to have gone crazy about ChatGPT, which has become a cultural phenomenon. If you’re not on the ChatGPT train yet, this article might help you better understand the context and excitement around this innovation.

Mastering Apache Airflow® 3.0: What’s New (and What’s Next) for Data Orchestration

Speaker: Tamara Fingerlin, Developer Advocate

Apache Airflow® 3.0, the most anticipated Airflow release yet, officially launched this April. As the de facto standard for data orchestration, Airflow is trusted by over 77,000 organizations to power everything from advanced analytics to production AI and MLOps. With the 3.0 release, the top-requested features from the community were delivered, including a revamped UI for easier navigation, stronger security, and greater flexibility to run tasks anywhere at any time.

Build Better Tests For Your dbt Projects With Datafold And data-diff

Data Engineering Podcast

Summary: Data engineering is all about building workflows, pipelines, systems, and interfaces to provide stable and reliable data. Your data can be stable and wrong, but then it isn't reliable. Confidence in your data is achieved through constant validation and testing. Datafold has invested a lot of time into integrating with the workflow of dbt projects to add early verification that the changes you are making are correct.

Project 130

Unlock the Power of Real-time Data Processing with Databricks and Google Cloud

databricks

We are excited to announce the official launch of the Google Pub/Sub connector for the Databricks Lakehouse Platform. This new connector adds to.

Migrating Critical Traffic At Scale with No Downtime - Part 2

Netflix Tech

By Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, Devang Shah. Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. Behind these perfect moments of entertainment is a complex mechanism, with numerous gears and cogs working in harmony.

Systems 114

5 Free Julia Books For Data Science

KDnuggets

Discover the full potential of the Julia programming language for data analysis and modeling with a comprehensive guide that covers everything from its syntax to advanced techniques.

Agent Tooling: Connecting AI to Your Tools, Systems & Data

Speaker: Alex Salazar, CEO & Co-Founder @ Arcade | Nate Barbettini, Founding Engineer @ Arcade | Tony Karrer, Founder & CTO @ Aggregage

There’s a lot of noise surrounding the ability of AI agents to connect to your tools, systems and data. But building an AI application into a reliable, secure workflow agent isn’t as simple as plugging in an API. As an engineering leader, it can be challenging to make sense of this evolving landscape, but agent tooling provides such high value that it’s critical we figure out how to move forward.

Pivot your Perspective: Embracing the Full Power of Microsoft's Power Platform

FreshBI

Switching from Tableau to PowerBI is just the first step. At our company, we don't just transition you to a new data visualization tool. We connect you to an entire ecosystem of powerful, user-friendly solutions: the Microsoft Power Platform. Here's how we do it and why it's a game changer. The Starting Point: A Power-Packed Punch. The Power Platform is Microsoft's suite of business analytics tools, which includes PowerBI, Power Apps, Power Automate, and Power Virtual Agents.

Coding 105

Lakehouse Orchestration with Databricks Workflows

databricks

Organizations across industries are adopting the lakehouse architecture and using a unified platform for all their data, analytics and AI workloads. When moving.

Understanding level of detail in Business Analyst’s color-coded maps

ArcGIS

The new color-coded map functionality introduces enhanced map visualization, interactive panels and level of detail control.

Coding 103

How to MLOps like a Boss: A Guide to Machine Learning without Tears

KDnuggets

If you have ever emailed a .pickle file to engineers for deployment, this is for YOU!

How to Modernize Manufacturing Without Losing Control

Speaker: Andrew Skoog, Founder of MachinistX & President of Hexis Representatives

Manufacturing is evolving, and the right technology can empower—not replace—your workforce. Smart automation and AI-driven software are revolutionizing decision-making, optimizing processes, and improving efficiency. But how do you implement these tools with confidence and ensure they complement human expertise rather than override it? Join industry expert Andrew Skoog as he explores how manufacturers can leverage automation to enhance operations, streamline workflows, and make smarter, data-dri

Open-Sourcing AvroTensorDataset: A Performant TensorFlow Dataset For Processing Avro Data

LinkedIn Engineering

Co-authors: Jonathan Hung, Pei-Lun Liao, Lijuan Zhang, Abin Shahab, Keqiu Hu. TensorFlow is one of the most popular frameworks we use to train machine learning (ML) models at LinkedIn. It allows us to develop various ML models across our platform that power relevance and matching in the news feed, advertisements, recruiting solutions, and more. To ensure the best member experience, we want our models to be accurate and up-to-date, which requires training the models as fast as possible.

Datasets 102

Introducing the Well-Architected Data Lakehouse from Databricks

databricks

To provide customers with a framework for planning and implementing their data lakehouse, we are pleased to announce that we have recently published.

Data 105

Repurposing Deep Learning Models using Transfer Learning in ArcGIS

ArcGIS

Re-train a deep learning model using transfer learning in ArcGIS. Start with an Esri pre-trained model, then create more training samples.

How to Optimize SQL Queries for Faster Data Retrieval

KDnuggets

Today, we’ll talk about why SQL query optimization is important and which techniques can be used to optimize queries.

SQL 154

The Ultimate Guide to Apache Airflow DAGS

With Airflow being the open-source standard for workflow orchestration, knowing how to write Airflow DAGs has become an essential skill for every data engineer. This eBook provides a comprehensive overview of DAG writing features with plenty of example code. You’ll learn how to: understand the building blocks of DAGs, combine them in complex pipelines, and schedule your DAG to run exactly when you want it to; write DAGs that adapt to your data at runtime and set up alerts and notifications; scale you

How to identify your business-critical data

Towards Data Science

Practical steps for identifying business-critical data models and dashboards and driving confidence in your data. Source: synq.io. This article has been co-written with Lindsay Murphy. Not all data is created equal. If you work in a data team, you know that if a certain dashboard breaks you drop everything and jump on it, whereas other issues can wait until the end of the week.

BI 98

What’s New with Databricks Notebooks

databricks

Databricks Notebooks offers developers a managed authoring experience where data and AI teams can efficiently collaborate on projects together. The team here is.

Project 105

Deep Multi-task Learning and Real-time Personalization for Closeup Recommendations

Pinterest Engineering

Haomiao Li | Software Engineer, Closeup Ranking & Blending; Travis Ebesu | Software Engineer, Closeup Ranking & Blending; Fan Jiang | Software Engineer, Closeup Candidates; Jay Adams | Software Engineer, Pinner Growth & Signals; Olafur Gudmundsson | Software Engineer, Pinner Discovery; Yan Sun | Engineering Manager, Closeup Ranking & Blending; Huizhong Duan | Engineering Manager, Closeup Relevance. Introduction: At Pinterest, Closeup recommendations (aka Related Pins) is typically

OpenAI’s Approach to AI Safety

KDnuggets

What will happen with safety approaches in AI systems after OpenAI’s CEO Sam Altman testified about the concerns around new technology?

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Blend mode helper

ArcGIS

A color card reference layer, for exploring and understanding the visual effects of blend modes in any map.

Guest Post: Using Lamini to train your own LLM on your Databricks data

databricks

This is a guest post from our startup partner, Lamini. Play with this LLM pictured above, trained on Lamini documentation. Live now! You.

Data 103

Window Functions - A must know for Data Engineers and Data Scientists

Towards Data Science

Back To Basics | SQL fundamentals for beginners

The Effects of ChatGPT in Schools and Why It’s Getting Banned

KDnuggets

Many schools are banning ChatGPT for plagiarism, accuracy and privacy concerns. However, the chatbot could help students and teachers with the right application.

How to Achieve High-Accuracy Results When Using LLMs

Speaker: Ben Epstein, Stealth Founder & CTO | Tony Karrer, Founder & CTO, Aggregage

When tasked with building a fundamentally new product line with deeper insights than previously achievable for a high-value client, Ben Epstein and his team faced a significant challenge: how to harness LLMs to produce consistent, high-accuracy outputs at scale. In this new session, Ben will share how he and his team engineered a system (based on proven software engineering approaches) that employs reproducible test variations (via temperature 0 and fixed seeds), and enables non-LLM evaluation m