The emergence of massive unstructured data sources like Facebook and Twitter has created a need to develop distributed processing systems for Big Data Analytics. Hadoop (A Java based programming framework) has become the first choice of developers and industry experts mainly because its: Highly scalable, flexible, and cheap. An application is broken down into various small parts which runs on thousands of nodes to achieve fast computing speed and reduce overall operation time. Hadoop architecture continues to operate even if a node fails. Its incredible design allows you to process large volumes of data and extract computationally difficult features of users/customers.

Read more at : http://www.datasciencecentral.com/forum/topics/how-to-use-hadoop-for-data-science