Mon. Jul 15, 2024

The software engineering industry in 2024: what changed, why, and what is next

The Pragmatic Engineer

The past 18 months have seen major change reshape the tech industry. What does it all mean for businesses and dev teams – and what will pragmatic software engineering approaches look like in the future? I tackled these burning questions in my conference talk, “What’s Old is New Again,” which was the keynote of the Craft Conference in May 2024.

Landing a Data Engineer Role: Free Courses and Certifications

KDnuggets

Is it possible to learn data engineering for free? I claim it is and present the evidence for that in the form of 10 free data engineering courses.

Announcing the General Availability of Serverless Compute for Notebooks, Workflows and Delta Live Tables

databricks

We are excited to announce the General Availability of serverless compute for notebooks, jobs and Delta Live Tables (DLT) on AWS and Azure.

AWS 107

Convert Bytes to String in Python: A Tutorial for Beginners

KDnuggets

Strings are common built-in data types in Python. But sometimes, you may need to work with bytes instead. Let’s learn how to convert bytes to string in Python.
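
As a rough illustration of the conversion this tutorial covers, here is a minimal sketch using only Python built-ins. The sample values are hypothetical and are not taken from the linked article.

```python
# Hypothetical sample data, not taken from the linked tutorial.
raw = "caf\u00e9".encode("utf-8")  # bytes: b'caf\xc3\xa9'

# bytes.decode() turns bytes into str using the named encoding.
text = raw.decode("utf-8")

# str(raw, encoding) is an equivalent spelling of the same conversion.
same_text = str(raw, "utf-8")

# Invalid byte sequences raise UnicodeDecodeError by default;
# errors="replace" substitutes U+FFFD for the bad byte instead.
lossy = b"caf\xff".decode("utf-8", errors="replace")

print(text, same_text, lossy)
```

The encoding passed to `decode()` has to match the one used to produce the bytes; UTF-8 is assumed here.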

Bytes 128
Apache Airflow® 101 Essential Tips for Beginners

Apache Airflow® is the open-source standard to manage workflows as code. It is a versatile tool used in companies across the world from agile startups to tech giants to flagship enterprises across all industries. Due to its widespread adoption, Airflow knowledge is paramount to success in the field of data engineering.

Generative AI in Urban Planning

ArcGIS

Planning a city block, a neighborhood, or maybe a whole new city is a multifaceted task with no universal recipe to use. How can Generative AI help Urban Planners?

Designing 106

More Trending

Explore BlueBikes ride data with ArcGIS Pro Charts

ArcGIS

In this blog article, we'll explore data from BlueBikes, a bike share service in bustling Boston, and uncover hidden insights through the power of visualization.

Data 104

Generative AI Use Case: Using LLMs to Score Customer Conversations

Monte Carlo

Despite all the talk about AI replacing humans, Skynet blowing up the sun, and deep-fake celebrities parenting our children, it’s difficult to point to a generative AI use case that is demonstrably more interesting than your average run-of-the-mill chatbot. But what if instead of replacing customer support teams with chatbots, we could leverage AI to improve the performance of real human CS teams?

Navigating the LLM Landscape: Uber’s Innovation with GenAI Gateway

Uber Engineering

Uber elevates tech with the GenAI Gateway, integrating Large Language Models (LLMs) for 60+ use cases, from automation to customer support. This unified platform offers easy access to models from OpenAI, Vertex AI, and Uber’s own, ensuring efficiency and security.

5 Tips for Managing Data Science Teams

KDnuggets

Data scientists are still people, and these tips can improve how you manage them.

Apache Airflow® Best Practices: DAG Writing

Speaker: Tamara Fingerlin, Developer Advocate

In this new webinar, Tamara Fingerlin, Developer Advocate, will walk you through many Airflow best practices and advanced features that can help you make your pipelines more manageable, adaptive, and robust. She'll focus on how to write best-in-class Airflow DAGs using the latest Airflow features like dynamic task mapping and data-driven scheduling!

PySpark Explained: User-Defined Functions

Towards Data Science

What are they, and how do you use them?

Understanding Develocity Build Data with Honeycomb

Pinterest Engineering

David Chang, Staff Software Engineer | Develocity, formerly known as Gradle Enterprise, is a powerful tool that speeds up local and CI build time, helps troubleshoot your builds, and analyzes your data. At Pinterest, we have a dedicated team, Mobile Builds, and we ensure that developers can build fast and often. This enables developers to be more productive by getting faster feedback on their code.

Becoming a Data Owner: Roles, Responsibilities, and 4 Best Practices

Monte Carlo

Companies are discovering that their most valuable asset isn’t just data—it’s data with a custodian. Enter the data owner, a role that’s part babysitter, part asset manager, and part translator between the ones and zeros and the suits in the C-suite. It’s a position that sits at the intersection of technology, strategy, and governance—a triad that’s increasingly defining competitive advantage in our data-saturated economy.

GCP Cloud Run Monitoring with Datadog

RandomTrees

Datadog is a powerful monitoring and analytics platform for modern cloud environments. It provides real-time visibility into your infrastructure, applications, and logs. Overview of Datadog: Datadog is a monitoring and analytics tool for cloud-scale applications. IT and DevOps teams can use it to monitor the performance and health of their applications, infrastructure, and services in real time.

Cloud 52

Apache Airflow® Crash Course: From 0 to Running your Pipeline in the Cloud

With over 30 million monthly downloads, Apache Airflow is the tool of choice for programmatically authoring, scheduling, and monitoring data pipelines. Airflow enables you to define workflows as Python code, allowing for dynamic and scalable pipelines suitable to any use case from ETL/ELT to running ML/AI operations in production. This introductory tutorial provides a crash course for writing and deploying your first Airflow pipeline.

A Deep Dive into Data Lakes

Hevo

In this information age, there has been explosive growth in the rate and type of data generated daily. From mobile devices and IoT sensors to our online content, unprecedented amounts of data are generated. Traditional databases and warehouse technologies cannot handle this burst in data volume and variety.

Investigating Code Quality from PR Data by Amy Laws

Scott Logic

When a developer wants to make changes to a code base, they raise a pull request (PR) which contains the proposed changes to the code and a written summary of the changes made. Other developers will then review this PR, leaving comments or suggestions, before ultimately deciding whether to approve the changes. PRs contain valuable data which can help us to get an insight into the process of writing code, and the teams involved.

Coding 52

Snowflake Horizon: Revolutionizing Data Management and Analytics

Hevo

Has it ever occurred to you that the volume of data your business processes daily is too overwhelming? You are not alone. So many companies need help in managing and analyzing enterprise data efficiently. Introducing Snowflake Horizon, the game-changing solution that will revolutionize data management and analysis.

Safeguarding App Health and Consumer Experience with Metric-Aware Rollouts

DoorDash Engineering

As part of our ongoing efforts to enhance product development while safeguarding app health and the consumer experience, we are introducing metric-aware rollouts for experiments. Metric-aware rollouts refer to established decision rules to flag issues with automated checks on standardized app quality metrics during the new feature rollout process. Every action DoorDash takes focuses on enhancing the consumer experience.

Optimizing The Modern Developer Experience with Coder

Many software teams have migrated their testing and production workloads to the cloud, yet development environments often remain tied to outdated local setups, limiting efficiency and growth. This is where Coder comes in. In our 101 Coder webinar, you’ll explore how cloud-based development environments can unlock new levels of productivity. Discover how to transition from local setups to a secure, cloud-powered ecosystem with ease.

Demystifying Data Mesh

Precisely

Key Takeaways: Data Mesh is a modern data management architectural strategy that decentralizes development of trusted data products to support real-time business decisions and analytics. While a data mesh empowers domains with greater autonomy and innovation through human expertise, greater capabilities are needed by domain teams to properly access, organize, and govern data products.

Apache Iceberg vs Parquet – Comparing Table and File Formats

Hevo

Apache Iceberg and Parquet are popular storage formats in the big data industry. However, the two terms are often confused. So today, we’ll compare these two storage formats, their features, and their unique capabilities. Moreover, they are not competing technologies but complementary ones.

Celebrating Our Canada Office Opening

Robinhood

Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to financial information and investing. Together, we are building products and services that help create a financial system everyone can participate in. … Several members of our engineering, security, corporate engineering, and recruiting teams were recently in Toronto for our office opening in ea

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.