Sat. Jan 14, 2023 - Fri. Jan 20, 2023

Replacing Pandas with Polars. A Practical Guide.

Confessions of a Data Guy

I remember those days, oh so long ago, it seems like another lifetime. I haven’t used Pandas in many a year, decades, or whatever. We’ve all been there, done that. Pandas I mean. I would dare say it’s a rite of passage for most data folk. For those using Python, it’s probably one of the […]

How To Hire Junior Data Engineers

Seattle Data Guy

With all the recent data events I have put together, I inevitably run into new data engineers who are either finishing up college or looking to transition into a data engineer or data scientist position. In fact, I have talked to several newly graduated engineers who are struggling to find work. A few told me…

ChatGPT as a Python Programming Assistant

KDnuggets

Is ChatGPT useful for Python programmers, specifically those of us who use Python for data processing, data cleaning, and building machine learning models? Let's give it a try and find out.

What Big Tech layoffs suggest for the industry

The Pragmatic Engineer

👋 Hi, this is Gergely with a bonus, free issue of the Pragmatic Engineer Newsletter. We cover one out of five topics in today’s subscriber-only The Scoop issue. To get the full issues, twice a week: subscribe here. Update on 20 January: less than a day after publishing this article, Google announced historic layoffs that will impact ~12,000 positions.

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.

Data News — Week 23.03

Christophe Blefari

Summer is coming (credits). Hey, new Friday, new Data News edition. I'm so happy to see new people coming every week. Thank you for every recommendation you make about the blog or the Data News. This kindness for my content gives me wings. This week I don't want to be late, so let's start the weekly wrap-up. I got less inspired this week, which means a shorter edition.

What Is The State Of Data Engineering And Infrastructure In 2023

Seattle Data Guy

2022 is coming to an end. What is the state of data infra? Are Snowflake and Databricks still fighting over total cost of ownership? Is everyone switching to DuckDB? Are data engineers all learning Rust? Let’s try to answer these questions. Our team is putting together an all-day event focused on helping answer some…

More Trending

Building Applications With Data As Code On The DataOS

Data Engineering Podcast

Summary: The modern data stack has made it more economical to use enterprise-grade technologies to power analytics at organizations of every scale. Unfortunately, it has also introduced new overhead to manage the full experience as a single workflow. At the Modern Data Company they created the DataOS platform as a means of driving your full analytics lifecycle through code, while providing automatic knowledge graphs and data discovery.

Data News — Week 23.02

Christophe Blefari

Abandoned Pandas (credits). Hey. I have had busy weeks, and I'm sorry Data News is coming on Saturday again. It is a bit hard to travel by train, work, and write at the same time. Plus I'm a fast context switcher, so it piles up. Also, a few of you have sent me messages recently and I've not yet answered; I see you and I did not forget you.

Why You Should Simplify Your Data Infrastructure

Seattle Data Guy

“Good Design Is Easier to Change Than Bad Design” – The Pragmatic Programmer. Programming is just one aspect of the difficulties of tech work for data engineers. Creating simple yet robust systems that help manage your data infrastructure is equally important. This challenge of building a simple yet robust data infrastructure remains even with no-code/low-code solutions.

How to Use Python and Machine Learning to Predict Football Match Winners

KDnuggets

We will be learning web scraping and training supervised machine-learning algorithms to predict winning teams.

Apache Airflow®: The Ultimate Guide to DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

Driving Data, Delivering Value: Data Leaders to Watch in 2023

Snowflake

The Chief Data Officer is arguably one of the most important roles at a company, particularly those that aspire to be data-driven. CDO appointments and the elevation of data leaders have accelerated in recent years, and the role has morphed as perceptions of data have evolved. Responsibilities span strategy and execution, people and processes, and the technology needed to deliver on the promise of data.

Devpod: Improving Developer Productivity at Uber with Remote Development

Uber Engineering

In this blog, we share how we improved the daily edit-build-run developer experience using DevPods, Uber’s remote development environment. We cover the challenges, pain points, our architecture, and lastly the future of remote development at Uber.

New Built-in Functions for Databricks SQL

databricks

Built-in functions extend the power of SQL with specific transformations of values for common needs and use cases. For example, the LOG10 function.

ChatGPT: Everything You Need to Know

KDnuggets

All you need to know about ChatGPT: what it can do, how it works, and its limitations.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

The Insurance Industry is Ready for a lot More Change

Teradata

The dwindling personal auto insurance market is a harbinger of a lot more change to come. Find out more.

Reducing Logging Cost by Two Orders of Magnitude using CLP

Uber Engineering

Uber’s Data team discusses how they used CLP to scale log ingestion, retention, and analytics for petabytes of Spark logs, reducing log storage and management costs by 169x.

Easy Ingestion to Lakehouse With COPY INTO

databricks

A new data management architecture known as the data lakehouse emerged independently across many organizations and use cases to support AI and BI.

SQL and Data Integration: ETL and ELT

KDnuggets

In this article, we will discuss use cases and methods for using ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes along with SQL to integrate data from various sources.

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Data Integrity Trends for 2023

Precisely

For most enterprises, 2022 was a year of transition, as companies struggled to figure out how to accomplish more with fewer resources. Technology helped to bridge the gap, as AI, machine learning, and data analytics drove smarter decisions, and automation paved the way for greater efficiency. Data integrity trends for 2023 have agility topping the list of success factors for most firms, as business leaders focus on rapid time to value and an emphasis on responding quickly to emerging opportunities

Uber’s Next Gen Push Platform on gRPC

Uber Engineering

Uber’s API platform team talks about how they built their Next Generation Push Platform on gRPC which helped improve the reliability and latency of messages significantly.

What’s New With SQL User-Defined Functions

databricks

Since their initial release, SQL user-defined functions have become hugely popular among both Databricks Runtime and Databricks SQL customers. This simple yet powerful.

Fast-track your next move with in-demand data skills

KDnuggets

DataCamp offers over 400 interactive courses, projects, and career tracks in the most popular data technologies such as Python, SQL, R, Power BI, and Tableau. Start today and save up to 67% on career-advancing learning.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, Terrence Sheflin, and Mahyar Ghasemali

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

How to make this 3D diorama of the Straits of Mackinac

ArcGIS

Here's one way to make these fun and intriguing micro-world cutaway sorts of things!

How Uber Optimizes the Timing of Push Notifications using ML and Linear Programming

Uber Engineering

The Uber Eats team shares how they built a novel system with machine learning and linear programming to send the right message at the right time to its users.

New! Diversity, equity, and inclusion analysis SpotApp helps businesses improve employee diversity

ThoughtSpot

Tech has a diversity problem. As a veteran People leader, I see and hear about it all the time: in media, in the boardroom, and in my daily work. And yet, as much as our industry is known for solving large-scale problems and disrupting the status quo, improvement in this area doesn’t seem to be happening fast enough. Why not? When I look at companies leading our industry in DEI, there’s one thing that stands out: data.

Scaling Data Management Through Apache Gobblin

KDnuggets

Software companies can manage big data at a hyper-scale on different infrastructure stacks using Apache Gobblin.

How to Drive Cost Savings, Efficiency Gains, and Sustainability Wins with MES

Speaker: Nikhil Joshi, Founder & President of Snic Solutions

Is your manufacturing operation reaching its efficiency potential? A Manufacturing Execution System (MES) could be the game-changer, helping you reduce waste, cut costs, and lower your carbon footprint. Join Nikhil Joshi, Founder & President of Snic Solutions, in this value-packed webinar as he breaks down how MES can drive operational excellence and sustainability.

How to add an inner band of color to polygons in ArcGIS Pro

ArcGIS

Here's how you can add a ribbon of color to the inside of polygons, without rendering gaps or jaggies.

Deduping and Storing Images at Uber Eats

Uber Engineering

Our engineers discuss how we dedupe and store millions of product images at Uber Eats using a content-addressable caching layer, which saves millions of image downloads every hour and ensures that every image is only stored once.
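
To make the idea of content-addressable storage concrete, here is a minimal in-memory sketch. It is not Uber's implementation: the class name `ContentAddressableStore`, its methods, and the choice of SHA-256 as the content key are all assumptions made for illustration. The point it shows is that identical bytes arriving under different URLs end up stored only once.

```python
import hashlib


class ContentAddressableStore:
    """Hypothetical in-memory store keyed by a hash of the content."""

    def __init__(self):
        self._blobs = {}  # content digest -> image bytes
        self._urls = {}   # source URL -> content digest

    def put(self, url: str, data: bytes) -> str:
        # The key is derived from the bytes themselves (assumed: SHA-256),
        # so identical images always map to the same key.
        digest = hashlib.sha256(data).hexdigest()
        # Store the bytes only if this content has not been seen before.
        self._blobs.setdefault(digest, data)
        self._urls[url] = digest
        return digest

    def get(self, url: str):
        # Returns the stored bytes for a known URL, or None otherwise.
        digest = self._urls.get(url)
        return None if digest is None else self._blobs[digest]

    def blob_count(self) -> int:
        return len(self._blobs)


store = ContentAddressableStore()
first = store.put("https://example.com/a/burger.jpg", b"same-image-bytes")
second = store.put("https://example.com/b/burger.jpg", b"same-image-bytes")
```

Both `put` calls return the same digest and the store holds a single blob, which is the deduplication property the teaser describes.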

Functional Python, Part II: Dial M for Monoid

Tweag

Tweagers have an engineering mantra — Functional. Typed. Immutable. — that begets composable software which can be reasoned about and avails itself to static analysis. These are all “good things” for building robust software, which inevitably lead us to using languages such as Haskell, OCaml and Rust. However, it would be remiss of us to snub languages that don’t enforce the same disciplines, but are nonetheless popular choices in industry.

Encoding Categorical Features with MultiLabelBinarizer

KDnuggets

Transform multi-label format into a binary matrix for multi-label classification.
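
As a rough illustration of what turning a multi-label format into a binary matrix means, the sketch below hand-rolls the transform in plain Python. It is not the scikit-learn implementation; the function name `binarize_labels` and the sample data are assumptions made for this example.

```python
def binarize_labels(samples):
    """Return (classes, matrix) for a list of label collections.

    Each row of the matrix has one column per known label, holding 1
    when the sample carries that label and 0 otherwise.
    """
    # The sorted union of all labels fixes a stable column order.
    classes = sorted({label for labels in samples for label in labels})
    column = {label: index for index, label in enumerate(classes)}

    matrix = []
    for labels in samples:
        row = [0] * len(classes)
        for label in labels:
            row[column[label]] = 1
        matrix.append(row)
    return classes, matrix


# Hypothetical sample data: each item may carry several labels.
genres = [{"comedy", "drama"}, {"thriller"}, {"drama", "thriller"}]
classes, matrix = binarize_labels(genres)
```

Each sample becomes one row and each distinct label one column, so a classifier can treat every column as a separate yes/no target.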

The Cloud Development Environment Adoption Report

Cloud Development Environments (CDEs) are changing how software teams work by moving development to the cloud. Our Cloud Development Environment Adoption Report gathers insights from 223 developers and business leaders, uncovering key trends in CDE adoption. With 66% of large organizations already using CDEs, these platforms are quickly becoming essential to modern development practices.