Mon. Apr 22, 2024

Docker Fundamentals for Data Engineers

Start Data Engineering

1. Introduction
2. Docker concepts
2.1. Define the OS and its configurations with an image
2.2. Use the image to run containers
2.2.1. Communicate between containers and local OS
2.2.2. Start containers with docker CLI or compose
3. Conclusion

1. Introduction

Docker can be overwhelming to start with. Most data projects use Docker to set up the data infra locally (and often in production).

5 Free Stanford University Courses to Learn Data Science

KDnuggets

Are you an aspiring data scientist? If so, these free data science courses from Stanford will help you move forward in your data science journey!

How to test PySpark code with pytest

Start Data Engineering

1. Introduction
2. Ensure the code’s logic is working as expected with tests
2.1. Test types for data pipelines
2.2. pytest: A powerful Python library for testing
2.2.1. Set context, run code, check results & clean up
2.2.2. Tests are identified by their name
2.2.3. Use fixture to create fake data for testing
2.2.4. Define items to be shared among tests with conftest.
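
The outline items on name-based test discovery and fixtures can be illustrated with a minimal sketch. It uses plain Python lists rather than PySpark DataFrames to stay self-contained, and the names `build_fake_orders`, `total_amount`, and `fake_orders` are hypothetical, not taken from the article.

```python
import pytest


def build_fake_orders():
    """Return a small hand-written dataset (hypothetical columns)."""
    return [
        {"order_id": 1, "amount": 10.0},
        {"order_id": 2, "amount": 5.5},
    ]


def total_amount(orders):
    """Code under test: sum the 'amount' field of every order."""
    return sum(order["amount"] for order in orders)


@pytest.fixture
def fake_orders():
    # pytest passes this return value to any test that lists
    # `fake_orders` as a parameter.
    return build_fake_orders()


def test_total_amount(fake_orders):
    # Collected by pytest because the function name starts with `test_`.
    assert total_amount(fake_orders) == 15.5
```

Saved in a file such as `test_orders.py` (an assumed name), this should be picked up by running `pytest` in the same directory. Moving the fixture into a `conftest.py` file would make it available to other test files in that directory.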

How to Install Python 3 on Ubuntu [Step-by-Step Guide]

Knowledge Hut

Anyone aspiring to be a data scientist, machine learning engineer, or software developer must have thought about learning Python. The popularity of this programming language has grown exponentially in the past ten years. Even those unfamiliar with coding have probably heard about it. As per the developer survey by Stack Overflow in 2021, approximately 68% of software developers or data scientists who have worked on developing software using Python have expressed that they will continue doing so

15 Modern Use Cases for Enterprise Business Intelligence

Large enterprises face unique challenges in optimizing their Business Intelligence (BI) output due to the sheer scale and complexity of their operations. Unlike smaller organizations, where basic BI features and simple dashboards might suffice, enterprises must manage vast amounts of data from diverse sources. What are the top modern BI use cases for enterprise businesses to help you get a leg up on the competition?

Are we ready to put AI in the hands of business users? by Caitlin Salt

Scott Logic

Generative AI has been grabbing headlines, but many businesses are starting to feel left behind. Large-model AI is becoming more and more influential in the market, and with the well-known tech giants starting to introduce easy-access AI stacks, a lot of businesses are left feeling that although there may be a use for AI in their business, they’re unable to see which use cases it might help them with.

Drawing a Blank? Understanding Drawing Alerts in ArcGIS Pro

ArcGIS

A drawing alert notification system was added in ArcGIS Pro 3.2 as a method for resolving drawing issues in your ArcGIS Pro projects.

More Trending

Magnite’s Seamless Petabyte Scale Cross-Region Migration with Snowgrid

Snowflake

Magnite stands as the largest independent sell-side advertising platform, providing an essential bridge between publishers and advertisers. At its core, Magnite streamlines the advertising process, facilitating the buying and selling of advertising space across various channels, including connected TV (CTV), mobile, and desktop environments. By leveraging advanced technology and data analytics, Magnite offers a comprehensive suite of tools and services designed to maximize ad revenue for publish

Web Performance Regression Detection (Part 1 of 3)

Pinterest Engineering

Michelle Vu | Web Performance Engineer

Detecting, preventing, and resolving performance regressions has been a standard at Pinterest for many years. Over the years, we have seen many examples showing significant business metric movements resulting from performance optimizations and regressions. These concrete examples motivate us to optimize and maintain performance.

Semantic Search with Vector Databases

KDnuggets

Leverage the latest technology to improve our search engine capabilities.

Introducing Project Inception: The Next Evolution in Data Automation

Ascend.io

At Ascend, we believe it’s time to rethink data engineering from the ground up. As the world of data continues to evolve at a breakneck pace, we are thrilled to announce the next revolutionary step in our journey – Project Inception. Ascend has always been at the forefront of innovation, and with Project Inception, we’re setting a new standard.

Prepare Now: 2025's Must-Know Trends For Product And Data Leaders

Speaker: Jay Allardyce, Deepak Vittal, and Terrence Sheflin

As we look ahead to 2025, business intelligence and data analytics are set to play pivotal roles in shaping success. Organizations are already starting to face a host of transformative trends as the year comes to a close, including the integration of AI in data analytics, an increased emphasis on real-time data insights, and the growing importance of user experience in BI solutions.

How to Stand Out and Safeguard Your Job in the Generative AI Era

KDnuggets

The secret recipe to excel in your career in AI.

How Striim Enhances Healthcare at Discovery Health with Real-Time Data

Striim

Discovery Health, originating in South Africa, has transcended borders to extend its services to over 40 million customers across more than 40 global markets, encompassing regions in Asia, EMEA, and the Americas. Since its inception in 1992, the company has remained steadfast in its core purpose: “to make people healthier and to enhance and protect their lives.” As a multifaceted financial services organization, Discovery Health operates in various sectors including healthcare, life

Async APIs - don't confuse your events, commands and state by David Hope

Scott Logic

In my previous blog post I looked at various technologies for sending data asynchronously between services, including RabbitMQ, Kafka, and AWS EventBridge. This time round I’ll look at the messages themselves, which over the last few years I’ve found to be a more complex and nuanced topic than expected. To set the scene, see the diagram below of an imaginary financial trading application: There’s lots of data flying around, varying from real-time pricing data to instructions to execute trades.
