Remove Lambda Architecture Remove MySQL Remove Unstructured Data
article thumbnail

Apache Spark Use Cases & Applications

Knowledge Hut

Though the majority of use cases of Spark uses HDFS as the underlying data file storage layer, it is not mandatory to use HDFS. It does work with a variety of other Data sources like Cassandra, MySQL, AWS S3 etc. Streaming Data: Streaming is basically unstructured data produced by different types of data sources.

Scala 52
article thumbnail

20+ Data Engineering Projects for Beginners with Source Code

ProjectPro

Thus, as a learner, your goal should be to work on projects that help you explore structured and unstructured data in different formats. Data Warehousing: Data warehousing utilizes and builds a warehouse for storing data. A data engineer interacts with this warehouse almost on an everyday basis.