
In my last blog post I described how Apache Kafka is key to the success of modern businesses. In this post I cover another technology that has become indispensable because of the sheer volume of data businesses now handle.

When you think about large volumes of data, there are two things we need to work out:

1) How and where to store it

2) How to process it.

Hadoop is an open-source platform built for exactly this situation, and it is now common in businesses that work with data at scale. Hadoop solves the storage problem with distributed storage, and the processing problem with a compute cluster that works on large datasets.

For distributed processing Hadoop uses MapReduce, and for distributed storage it uses the Hadoop Distributed File System (HDFS).
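To give a feel for the MapReduce model, here is a minimal sketch in plain Python. It does not use Hadoop itself, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. It runs on a single machine, whereas a real cluster would run the map and reduce steps on many machines at once.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Map: every line is processed independently of the others.
    pairs = [pair for line in lines for pair in map_phase(line)]
    # Shuffle + reduce: every key is reduced independently of the others.
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data needs storage", "big data needs processing"]))
```

The point to notice is that each line can be mapped on its own and each word can be reduced on its own, which is what makes it possible to spread the work over many machines.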

Hadoop is used wherever large amounts of data need to be both stored and processed. Because Hadoop scales horizontally, by adding more machines to the cluster rather than upgrading a single one, it can process data in parallel.
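A rough way to picture horizontal scaling is to look at how a large file could be cut into blocks and spread over the machines in a cluster. The sketch below is a simplification, not HDFS code: the helper names are hypothetical, and the round-robin placement is an assumption made to keep the example short (real HDFS placement also takes racks into account). The 128 MB block size and 3 copies per block mirror common HDFS defaults.

```python
def split_into_blocks(file_size_mb, block_size_mb=128):
    """Return how many fixed-size blocks a file of this size needs."""
    return -(-file_size_mb // block_size_mb)  # ceiling division


def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Simplified, hypothetical placement policy for illustration only.
    """
    replication = min(replication, len(nodes))
    placement = {node: [] for node in nodes}
    for block in range(num_blocks):
        for r in range(replication):
            node = nodes[(block + r) % len(nodes)]
            placement[node].append(block)
    return placement


if __name__ == "__main__":
    blocks = split_into_blocks(1024)  # a 1 GB file
    for cluster_size in (4, 8):
        nodes = [f"node{i}" for i in range(cluster_size)]
        placement = place_blocks(blocks, nodes)
        busiest = max(len(b) for b in placement.values())
        print(cluster_size, "nodes, busiest node holds", busiest, "blocks")
```

Doubling the number of nodes halves the number of blocks each node has to hold and process, and because every block lives on more than one node, losing a single machine does not lose the data.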

Areas where Hadoop is a fit:

  • Cyber Security and Fraud Detection: In government and large businesses
  • Healthcare: By its nature, healthcare needs systems that keep working and keep their data when something fails
  • Ad Targeting: Hadoop can capture, store and process data generated by social media and other tools that track user behavior and current state of mind, which can then be used to target ads