In this blog we are going to talk about Hadoop. The questions that come to mind are:
- Why do we need Hadoop?
- Do we need it when we already have data-processing tools, query languages like SQL, integration platforms, and BI tools like Informatica?
Companies like Facebook, Google, and Yahoo are data-centric companies. According to statistics, around 1,800 exabytes of data are stored every day worldwide, and only 20% of it is structured data; the other 80% is unstructured, in other words unorganized.
Traditional software struggles to process unstructured data. The data keeps increasing every day, so we require an inexpensive and reliable storage mechanism to hold bulk amounts of data and to process it.
The ideal solution for these requirements is Hadoop.
Use Cases:
These requirements exist not only for big companies; small and medium-sized companies will need this as well. For example, consider some use cases where we want to:
- Analyse daily logs generated over a period of a month or a year
- Process image data uploaded by different users
- Process unstructured data
- Do image processing
- Run search operations on text, video, and log data
- Analyse all the login and logout operations that happened over a year
So what is Hadoop? In short, it is:
- An open source MapReduce implementation (a minimal word-count sketch follows this list)
- Able to work on a cluster of machines
- Fault tolerant: even if one node out of 1,000 fails, Hadoop makes a note of this and hands the work over to the available nodes
- Highly scalable: you can add any number of nodes at any point in time
- Very flexible in adapting to changes
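
To give a feel for what a MapReduce program looks like, here is the classic word-count example written against Hadoop's Java MapReduce API. This is a minimal sketch: the input and output paths are supplied on the command line, and the cluster configuration is assumed to come from the environment.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: for every input line, emit a (word, 1) pair per word.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sum up the counts collected for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory in HDFS
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The mapper emits a (word, 1) pair for every word it sees; Hadoop groups the pairs by word across the whole cluster, and the reducer sums them up. The same map/group/reduce pattern fits the log-analysis use cases listed above.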
Please note that Hadoop understands only two things:
- HDFS (the Hadoop Distributed File System; see the sketch after this list)
- MapReduce
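
Here is a small sketch of talking to HDFS directly from Java using Hadoop's FileSystem API, for instance to load log files into the cluster before running a job. The NameNode address hdfs://namenode:9000 and the file paths are hypothetical placeholders; substitute your own cluster's values.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPutExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical NameNode address; replace with your cluster's.
    conf.set("fs.defaultFS", "hdfs://namenode:9000");
    FileSystem fs = FileSystem.get(conf);

    // Copy a local log file into HDFS so MapReduce jobs can process it.
    fs.copyFromLocalFile(new Path("/tmp/app.log"), new Path("/logs/app.log"));

    // List what is now stored under /logs.
    for (FileStatus status : fs.listStatus(new Path("/logs"))) {
      System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
    }
    fs.close();
  }
}
```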
We will discuss in more detail how all of this works in the next blog, Part II on Hadoop.