Hadoop

Bigdata Hadoop HDFS
Published on : Aug-2016
Sreenivasulu Akkem

The Hadoop Distributed File System (HDFS) is highly fault-tolerant and is designed to store very large data sets on low-cost hardware reliably, and to stream those data sets at high bandwidth to user applications.
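As an illustration of those design ideas, here is a toy Python sketch of block splitting and replica placement (all names, sizes, and the placement policy are simplified assumptions, not HDFS internals):

```python
# Toy sketch of two HDFS ideas: files are split into fixed-size blocks,
# and each block is replicated across several DataNodes for fault tolerance.
# Sizes and node names are illustrative (real HDFS blocks default to 128 MB).

BLOCK_SIZE = 4          # bytes; tiny on purpose, for demonstration
REPLICATION = 3

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    # Chop the file contents into fixed-size blocks.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(num_blocks: int, datanodes, replication: int = REPLICATION):
    # Round-robin placement; real HDFS also considers racks and node load.
    return {
        b: [datanodes[(b + r) % len(datanodes)] for r in range(replication)]
        for b in range(num_blocks)
    }

blocks = split_into_blocks(b"hello hdfs!")
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"])
```

Losing one DataNode then leaves two surviving replicas of each block, which is what makes the design tolerant of cheap, failure-prone hardware.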

MapReduce
Published on : Jul-2016
Sreenivasulu Akkem

MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster.
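The model can be sketched in plain Python as a toy word count (illustrative only; real Hadoop MapReduce jobs are typically written against the Java API):

```python
from itertools import groupby
from operator import itemgetter

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in the input line.
    return [(word, 1) for word in line.split()]

def reduce_phase(word, counts):
    # Reduce: sum all counts that arrived for one key.
    return (word, sum(counts))

def mapreduce_wordcount(lines):
    # Run the mappers over every input record.
    pairs = [kv for line in lines for kv in map_phase(line)]
    # Shuffle/sort: group intermediate pairs by key, as the framework would
    # before handing each key's values to a reducer.
    pairs.sort(key=itemgetter(0))
    return dict(
        reduce_phase(word, (c for _, c in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )

counts = mapreduce_wordcount(["big data", "big cluster"])
```

The split into a stateless map phase and a per-key reduce phase is what lets the framework parallelize both phases across a cluster.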

Pig
Published on : Jul-2016
Sreenivasulu Akkem

Apache Pig is a high-level platform for creating programs that run on Apache Hadoop. The language for this platform is called Pig Latin. Pig can execute its Hadoop jobs in MapReduce, Apache Tez, or Apache Spark.

Hive
Published on : Jul-2016
Sreenivasulu Akkem

Apache Hive is an open-source data warehouse system for querying and analyzing large datasets stored in Hadoop files, originally developed by Facebook. It allows users to write queries in a SQL-like language called HiveQL, which are then converted into MapReduce jobs.

Sqoop
Published on : Jul-2016
Sreenivasulu Akkem

Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL and Oracle into Hadoop HDFS, and to export data from the Hadoop file system back into relational databases.
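For illustration, a small hypothetical Python helper assembling typical Sqoop command lines (the helper functions are made up for this sketch; the `--connect`, `--table`, `--target-dir`, and `--export-dir` flags are real Sqoop options):

```python
# Hypothetical helpers that assemble Sqoop command lines as argument lists,
# e.g. for use with subprocess.run. Only the flag names come from Sqoop itself.

def sqoop_import(jdbc_url: str, table: str, target_dir: str) -> list[str]:
    # Import one relational table into an HDFS directory.
    return ["sqoop", "import",
            "--connect", jdbc_url,
            "--table", table,
            "--target-dir", target_dir]

def sqoop_export(jdbc_url: str, table: str, export_dir: str) -> list[str]:
    # Export files from an HDFS directory back into a relational table.
    return ["sqoop", "export",
            "--connect", jdbc_url,
            "--table", table,
            "--export-dir", export_dir]

cmd = sqoop_import("jdbc:mysql://dbhost/shop", "orders", "/user/hadoop/orders")
```

The JDBC URL, table, and directory here are placeholder values for the sketch.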

Flume
Published on : Jul-2016
Sreenivasulu Akkem

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of streaming data, such as logs, into the Hadoop Distributed File System (HDFS).
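As a sketch, a minimal Flume agent configuration that tails a local log file into HDFS might look like this (the agent name `a1`, component names, and paths are illustrative; the property keys are standard Flume agent configuration):

```properties
# Illustrative Flume agent: exec source -> memory channel -> HDFS sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: tail a local application log (example path)
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app.log
a1.sources.r1.channels = c1

# Channel: buffer events in memory between source and sink
a1.channels.c1.type = memory

# Sink: write the events into HDFS (example path)
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/events
a1.sinks.k1.channel = c1
```

The source/channel/sink pipeline is Flume's core abstraction: events flow from a source through a buffering channel to a sink.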

Oozie
Published on : Jul-2016
Sreenivasulu Akkem

Oozie is a workflow scheduler system for managing Apache Hadoop jobs. Oozie Workflow jobs are Directed Acyclic Graphs (DAGs) of actions. Oozie Coordinator jobs are recurrent Oozie Workflow jobs triggered by time (frequency) and data availability.
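A skeletal Oozie workflow definition shows the DAG structure: each action names the node to follow on success (`ok`) and on failure (`error`). The workflow name, action name, and the empty map-reduce body below are illustrative placeholders:

```xml
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="first-action"/>
    <action name="first-action">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```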

Zookeeper
Published on : Aug-2016
Sreenivasulu Akkem

Apache ZooKeeper is a software project of the Apache Software Foundation, providing an open source distributed configuration service, synchronization service, and naming registry for large distributed systems. ZooKeeper was a sub-project of Hadoop but is now a top-level project in its own right.

Spark
Published on : Aug-2016
Sreenivasulu Akkem

Apache Spark is an open source big data processing framework built around speed, ease of use, and sophisticated analytics. It was originally developed in 2009 in UC Berkeley's AMPLab, and open sourced in 2010 as an Apache project.

Hadoop FAQ
Published on : Aug-2016
Sreenivasulu Akkem

The Hadoop FAQ section provides answers to frequently asked questions about Hadoop technology and helps job seekers prepare for Hadoop job interviews.
