2018, Data Storage and Hadoop - Data Engineering Digest


Recap of Hadoop News for February 2018

ProjectPro

MARCH 1, 2018

News on Hadoop - February 2018. Kyvos Insights to Host Webinar on Accelerating Business Intelligence with Native Hadoop BI Platforms. PRNewswire.com, February 1, 2018. The leading big data analytics company Kyvos Insights is hosting a webinar titled “Accelerate Business Intelligence with Native Hadoop BI platforms.”

Hadoop

Hadoop NoSQL Retail BI

Your Step-by-Step Guide to Become a Data Engineer in 2025

ProjectPro

JUNE 6, 2025

If you are planning to make a career transition into data engineering and want to know how to become a data engineer, this is the perfect place to begin your journey. Beginners will especially find it helpful if they want to know how to become a data engineer from scratch.

Data Engineer

Data Engineer Data Engineering Engineering Amazon Web Services


Cloudera + Hortonworks, from the Edge to AI

Cloudera

OCTOBER 3, 2018

First, remember the history of Apache Hadoop. Google built an innovative scale-out platform for data storage and analysis in the late 1990s and early 2000s, and published research papers about their work. Doug Cutting and Mike Cafarella started the Hadoop project to build an open-source implementation of Google’s system.

Hadoop

Hadoop Cloud Data Storage Machine Learning


Hands-On Introduction to Delta Lake with (py)Spark

Towards Data Science

FEBRUARY 15, 2023

Concepts, theory, and functionalities of this modern data storage framework. Introduction: I think the value data can have is now perfectly clear to everybody. To use a hyped example, models like ChatGPT could only be built on a huge mountain of data, produced and collected over years.

Data Lake

Data Lake Data Warehouse Data Architecture Architecture

Databricks, Snowflake and the future

Christophe Blefari

JUNE 21, 2024

Both companies have added Data and AI to their slogans: Snowflake used to be The Data Cloud and now they're The AI Data Cloud. One way to read data platforms: when we look at the history of platforms, what characterises their evolution is the separation (or not) between the engine and the storage.

Metadata

Metadata Data Warehouse BI Scala

Unlock Answers to the Top Questions- What is Big Data and what is Hadoop?

ProjectPro

MARCH 17, 2014

Big data and Hadoop are catchphrases these days in the tech media for describing the storage and processing of huge amounts of data. Over the years, big data has been defined in various ways, and there is a lot of confusion surrounding the terms big data and Hadoop. What is Big Data according to IBM?

Hadoop

Hadoop Big Data Unstructured Data Retail

How to Become a Data Engineer in 2024?

Knowledge Hut

DECEMBER 26, 2023

Most data engineers working in the field enroll in several other training programs to learn an outside skill, such as Hadoop or Big Data querying, alongside their Master's degrees and PhDs. Hadoop Platform: Hadoop is an open-source software library created by the Apache Software Foundation.

Data Engineer

Data Engineer Data Engineering Engineering Pipeline-centric

Big Data Fabric Weaves Together Automation, Scalability, and Intelligence

Cloudera

JANUARY 22, 2019

Forrester describes Big Data Fabric as “A unified, trusted, and comprehensive view of business data produced by orchestrating data sources automatically, intelligently, and securely, then preparing and processing them in big data platforms such as Hadoop and Apache Spark, data lakes, in-memory, and NoSQL.”

Big Data

Big Data NoSQL Data Lake Hadoop

The Future of Data Engineering and Data Engineers

Knowledge Hut

JULY 5, 2024

Big Data Boom: Fast forward to the 2000s, and Big Data crashed onto the scene. Suddenly, we were dealing with massive amounts of information, and traditional tools struggled to keep up. Hadoop and Spark: The cavalry arrived in the form of Hadoop and Spark, revolutionizing how we process and analyze large datasets.

Data Engineer

Data Engineer Data Engineering Engineering Data Cleanse

The Evolution of Table Formats

Monte Carlo

MAY 14, 2024

The “legacy” table formats: The data landscape has evolved so quickly that table formats pioneered within the last 25 years are already achieving “legacy” status. It was designed to support high-volume data exchange and compatibility across different system versions, which is essential for streaming architectures such as Apache Kafka.

Data Lake

Data Lake Metadata Hadoop Data Governance

Top Data Lake Vendors (Quick Reference Guide)

Monte Carlo

APRIL 24, 2023

Data lakes are useful, flexible data storage repositories that enable many types of data to be stored in their rawest state. Notice how Snowflake dutifully avoids (what may be a false) dichotomy by simply calling itself a “data cloud.” Not to mention seamless integration with the Oracle ecosystem.

Data Lake

Data Lake Google Cloud Data Warehouse Cloud Storage

Top 8 Data Engineering Books [Beginners to Advanced]

Knowledge Hut

JUNE 30, 2023

Key Benefits and Takeaways: Understand data intake strategies and data transformation procedures by learning data engineering principles with Python. Investigate alternative data storage solutions, such as databases and data lakes. Key Benefits and Takeaways: Learn the core concepts of big data systems.

Data Engineer

Data Engineer Data Engineering Engineering Data Warehouse

Big Data Timeline- Series of Big Data Evolution

ProjectPro

AUGUST 26, 2015

The largest item on Claude Shannon’s list was the Library of Congress, which measured 100 trillion bits of data. 1960 - Data warehousing became cheaper. 1996 - Digital data storage became more cost-effective than paper, according to R.J.T. 2008 - Google processed 20 petabytes of data in a single day.

Big Data

Big Data Unstructured Data Hadoop NoSQL

Data Lake vs. Data Warehouse vs. Data Lakehouse

Sync Computing

NOVEMBER 7, 2024

Estimates vary, but the amount of new data produced, recorded, and stored is in the ballpark of 200 exabytes per day on average, with an annual total growing from 33 zettabytes in 2018 to a projected 169 zettabytes in 2025. In case you don't know your metrics, these numbers are astronomical!

Data Lake

Data Lake Data Warehouse Business Intelligence Unstructured Data

Top Careers in AI And Machine Learning For 2025

ProjectPro

JUNE 6, 2025

Data Engineer: Key Responsibilities. Some of the day-to-day responsibilities of a big data engineer include: Data Pipeline Design and Development: building and maintaining pipelines to gather and load raw (structured/unstructured) data from various sources. SQL, NoSQL) are essential.

Machine Learning

Machine Learning Computer Science Consulting Software Engineering

Q&A with Greg Rahn – The changing Data Warehouse market

Cloudera

DECEMBER 12, 2018

I spent eight years in the real-world performance group where I specialized in high visibility and high impact data warehousing competes and benchmarks. Greg Rahn: Toward the end of that eight-year stint, I saw this thing coming up called Hadoop and an engine called Hive. There’s MongoDB for document stores.

Data Warehouse

Data Warehouse Relational Database Hadoop BI

RocksDB Is Eating the Database World

Rockset

JANUARY 23, 2020

While traditional RDBMS databases served well the data storage and data processing needs of the enterprise world from their commercial inception in the late 1970s until the dotcom era, the large amounts of data processed by the new applications—and the speed at which this data needs to be processed—required a new approach.

Database

Database MySQL Kafka NoSQL

What is the Learning Path to Become an AWS Certified Solutions Architect Associate?

Knowledge Hut

NOVEMBER 16, 2023

Data Storage: As a Solutions Architect, you must have knowledge of databases. There are several data storage options available on the AWS platform. This includes powerful and simple bucket storage like S3, relational database service, and Hadoop clusters.

AWS

AWS Cloud Computing Certification Architecture

Top 15 Cloud Computing Projects Ideas for Beginners in 2023

ProjectPro

JULY 15, 2021

According to an Indeed Jobs report, the share of cloud computing jobs has increased by 42% per million from 2018 to 2021. It is recommended to use a SQL database for data storage, as it comes with built-in security tools and features. Decision-making and support will be performed using data mining and feature extraction.

Cloud Computing

Cloud Computing Cloud Project Banking
