Research Article | Open Access
Volume 14, 2022
A Systematic Literature Review of Big Data and the Hadoop frameworks
Devishree Naidu, Adi Thakur
Pages: 2969-2973
Abstract
Big data is a term for the huge volumes of data gathered mostly through new data sources such as Twitter, Instagram, and Facebook. This data is important because its analysis is changing how major businesses work and can provide the knowledge required to cut business costs. Most firms currently use this technology to accurately find trends and predict future events in their industries. The challenge lies in finding the best way to process, analyze, and draw useful insights from this data. Traditional data management tools cannot handle this data efficiently, so more advanced data technologies are required. This is mainly because of its unstructured nature and the five V’s commonly used to define big data: Volume, Variety, Velocity, Value, and Veracity. Since this data is growing at an exponential rate, it became necessary to develop technologies to address it. Hadoop, MapReduce, and NoSQL are the three major technologies that were developed to handle the complexities of big data and manage it reliably. This paper discusses the several Hadoop-based technologies, together called the Hadoop ecosystem, and their uses in analyzing big data.
Keywords
Big data, Flume, MapReduce, Hadoop Ecosystem, Hadoop frameworks