Introduction to Hadoop and Map Reduce



Hadoop is a MapReduce framework for processing large datasets in parallel on clusters of commodity hardware. It is cheaper than conventional alternatives because it is an open-source solution that runs on commodity hardware, and it is faster on massive data volumes because the data is processed in parallel.

A complete Hadoop MapReduce-based solution may have the following layers:

  1. Hadoop Core – HDFS
  2. Map-Reduce API
  3. Data Access
  4. Tools and libraries

Hadoop works by splitting files into blocks and distributing them across a number of nodes in a cluster. It then uses packaged code, shipped to those nodes, to process the data in parallel. This means the data can be processed more quickly than it could be with a conventional architecture.
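The split–map–shuffle–reduce flow described above can be sketched in plain Python. This is a minimal single-process simulation of the model, not the Hadoop API; all function and variable names here are illustrative assumptions:

```python
from collections import defaultdict
from itertools import chain

def map_phase(block):
    """Map: emit a (word, 1) pair for every word in one input block."""
    return [(word.lower(), 1) for word in block.split()]

def shuffle_phase(mapped):
    """Shuffle: group all emitted values by key, across every block."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(mapped):
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values into a final result."""
    return {key: sum(values) for key, values in groups.items()}

# An input "file" split into blocks, as HDFS would split it across nodes.
blocks = ["big data big clusters", "data processing in parallel"]

# In a real cluster, each map_phase call would run on a separate node,
# close to where its block is stored.
mapped = [map_phase(b) for b in blocks]
counts = reduce_phase(shuffle_phase(mapped))
print(counts)  # e.g. "big" appears twice, "data" appears twice
```

In Hadoop itself the same roles are played by `Mapper` and `Reducer` classes submitted as a job, with HDFS handling the block storage and the framework handling the shuffle between nodes.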

 

How do Hadoop and SQL compare? Watch this video for more information.

Check out coursehunt.net for more courses on technology, and http://www.coursehunt.net/?query=hadoop for Hadoop-specific courses.

Read the complete document in this SlideShare:

Hadoop MapReduce Fundamentals from coursehunt (SlideShare presentation – best viewed in full screen)

Video series on Hadoop MapReduce
