March, 2014

Writing Python command line scripts

Zalando Engineering

Python is great for writing command line scripts, and we use it a lot for internal tools and scripts at Zalando. Before extending a three-line Bash script, I usually rethink it and implement it in Python. This post summarizes some conventions and best practices I recommend. Command Line Options: Do you know the command line options of GNU tar? Probably not all of them.

Python 52

Unlock Answers to the Top Questions: What is Big Data and what is Hadoop?

ProjectPro

Big data and Hadoop are catchphrases in the tech media these days for describing the storage and processing of huge amounts of data. However, while you might be familiar with big data and Hadoop, there is a high probability that other people around you are not really sure what big data is, what Hadoop is, what big data analytics is, or why it is important.

Hadoop 52

5 Job Roles Available for Hadoopers

ProjectPro

With big data gaining traction in the IT industry, companies are looking to hire competent Hadoop-skilled talent more than ever before. The best way to understand the different technical professionals working with HDFS, MapReduce and the entire Hadoop ecosystem is to look at various Hadoop job descriptions, which are a mixed bag ranging from developers to data scientists.

Hadoop 40

Top 10 Industries using Big Data and 121 companies who hire Hadoop Developers

ProjectPro

Over the next decade, industries will use Big Data to solve unsolved data problems in the physical world. Big Data analysis will be about building systems around the data that is generated. Every department of an organization, including marketing, finance and HR, is now getting direct access to its own data. This is creating a huge job opportunity, and there is an urgent need for professionals to master Big Data and Hadoop skills.

Hadoop 40

Apache Airflow® Best Practices for ETL and ELT Pipelines

Whether you’re creating complex dashboards or fine-tuning large language models, your data must be extracted, transformed, and loaded. ETL and ELT pipelines form the foundation of any data product, and Airflow is the open-source data orchestrator specifically designed for moving and transforming data in ETL and ELT pipelines. This eBook covers: An overview of ETL vs.