A Systematic Literature Review of Big Data and the Hadoop frameworks

Research Article | Open Access

Volume 14, 2022

Devishree Naidu, Adi Thakur

Pages: 2969-2973

Abstract

Big data is a term for the huge amount of data gathered mostly through new data sources such as Twitter, Instagram, and Facebook. This data is important because its analysis is changing how major businesses work and can provide the knowledge required to cut business costs. Most firms are currently using this technology to accurately find trends and predict future events in their industries. The challenge lies in finding the best way to process, analyze, and draw useful insights from this data. Traditional data management tools cannot handle this data efficiently, so more advanced data technologies are required. This is mainly because of its unstructured nature and the five V's (Volume, Variety, Velocity, Value, and Veracity) that are commonly used to define big data. Since this data is growing at an exponential rate, it became necessary to develop technologies to address it. Hadoop, MapReduce, and NoSQL are the three major technologies that were developed to handle the complexities of big data and manage it reliably. This paper discusses the several Hadoop-based technologies that are collectively called the Hadoop ecosystem, and their uses in analyzing big data.

Keywords

Big data, Flume, MapReduce, Hadoop Ecosystem, Hadoop frameworks
