Brief History of Data Engineering

Jesse Anderson

Doug Cutting took those papers and created Apache Hadoop in 2005. Cloudera was started in 2008, and Hortonworks started in 2011. They were the first companies to commercialize open source big data technologies and pushed the marketing and commercialization of Hadoop. ... It gained in usage and eventually displaced Hadoop.

8 Best Python Data Science Books [Beginners and Professionals]

Knowledge Hut

You can master several crucial Python data science technologies from the Python Data Science Handbook, including Pandas, Matplotlib, NumPy, Scikit-Learn, Machine Learning, IPython, etc. The first version was launched on 30 December 2011, and the second edition was published in October 2017. This book is rated 4.16 out of 5 stars on Goodreads.

Recap of Hadoop News for March 2018

ProjectPro

News on Hadoop - March 2018. Kyvos Insights to Host Session "BI on Big Data - With Instant Response Times" at the Gartner Data and Analytics Summit 2018. PRNewswire.com, RTInsights.com, March 15, 2018. Information Builders is letting users of its WebFOCUS product tap into the power of Hadoop.

Data Engineers of Netflix: Interview with Kevin Wylie

Netflix Tech

His favorite TV shows: Ozark, Breaking Bad, Black Mirror, Barry, and Chernobyl. Since I joined Netflix back in 2011, my favorite project has been designing and building the first version of our entertainment knowledge graph. When I joined Netflix back in 2011, our content analytics team was just 3 people.

Cloudera + Hortonworks, from the Edge to AI

Cloudera

First, remember the history of Apache Hadoop. The two of them started the Hadoop project to build an open-source implementation of Google’s system. It staffed up a team to drive Hadoop forward, and hired Doug. Three years later, the core team of developers working inside Yahoo on Hadoop spun out to found Hortonworks.

Apache Kafka – Next Generation Distributed Messaging System

ProjectPro

Apache Kafka is breaking barriers and eliminating the slow batch processing method that is used by Hadoop. Kafka was mainly developed to make working with Hadoop easier. It is true that Kafka is eliminating the limitations of Hadoop, but it will not eliminate Hadoop itself. Apache Kafka attempts to solve this issue.

The Rise of the Data Engineer

Maxime Beauchemin

I joined Facebook in 2011 as a business intelligence engineer. This discipline also integrates specialization around the operation of so-called “big data” distributed systems, along with concepts around the extended Hadoop ecosystem, stream processing, and computation at scale. By the time I left in 2013, I was a data engineer.