Hive and Pig Training

Hadoop Training Chennai

Hadoop Training Chennai offers training in Hive and Pig, higher-level abstractions on top of MapReduce that allow those without Java programming knowledge to manage and manipulate data in a Hadoop cluster. The Apache Hive and Pig course is designed for people with a basic understanding of how Hadoop works who want to use these languages to analyze their data.

You Will Learn

  • How Hive augments MapReduce
  • How to create and manipulate tables using Hive
  • Hive’s basic and advanced data types
  • Partitioning and bucketing data with Hive
  • Advanced features of Hive
  • How to load and manipulate data using Pig
  • Features of the PigLatin programming language
  • Solving real-world problems with Pig

Hive and Pig Training: Prerequisites

This course is suitable for developers, data analysts and business analysts. Experience with SQL and scripting languages is recommended, but is not required. No pre-existing knowledge of Hadoop is required.

Outline

  • Introduction
  • Introduction to Hadoop and Hive
  • Getting Data into Hive
  • Manipulating Data with Hive
  • Partitioning and Bucketing Data
  • Advanced Hive Features
  • Hive Best Practices
  • Introduction to Pig
  • Pig’s Architecture
  • Reading and Writing Data with Pig
  • Advanced Pig Latin
  • Debugging Pig Scripts
  • Pig Best Practices

Hive and Pig

Hive: data warehousing application in Hadoop

  • Query language is HiveQL (HQL), a variant of SQL
  • Tables stored on HDFS as flat files
  • Developed by Facebook, now open source
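
To make the "flat files" point concrete, here is a minimal Python sketch of how column names can be applied to raw delimited lines at read time. It assumes Hive's default text layout, in which fields are separated by the Ctrl-A character (\x01) and rows by newlines. The column names, the sample rows, and the read_rows helper are hypothetical and exist only for illustration; this is not how Hive itself is implemented.

```python
# Hive's default text format separates fields with Ctrl-A (\x01).
# The helper, column names, and sample rows below are hypothetical.
FIELD_DELIMITER = "\x01"


def read_rows(lines, columns):
    """Split each raw line on the delimiter and label the values by column."""
    rows = []
    for line in lines:
        values = line.rstrip("\n").split(FIELD_DELIMITER)
        rows.append(dict(zip(columns, values)))
    return rows


# Two raw lines as they might appear in a file under a table's HDFS directory.
raw_lines = ["1\x01alice\x01chennai\n", "2\x01bob\x01mumbai\n"]
users = read_rows(raw_lines, ["id", "name", "city"])
print(users[0]["name"])
```

All values come back as strings here; a real table definition would also declare a type for each column.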

Pig: large-scale data processing system

  • Scripts are written in Pig Latin, a dataflow language
  • Developed by Yahoo!, now open source
  • Accounts for roughly 1/3 of all Yahoo! internal jobs

Common idea:

  • Provide higher-level language to facilitate large-data processing
  • Higher-level language “compiles down” to Hadoop jobs
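
As a rough illustration of what "compiles down" means, the sketch below simulates in plain Python the map, shuffle, and reduce steps that a grouped count such as SELECT city, COUNT(*) FROM users GROUP BY city conceptually turns into. The users table, the city column, and all function names are assumptions made for this example. The code runs in a single process and does not reproduce the actual execution plan that Hive or Pig would generate.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (key, 1) pair per record, keyed by the grouping column."""
    for record in records:
        yield record["city"], 1


def shuffle(pairs):
    """Collect all values that share a key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key to produce the per-group count."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_city(records):
    """Chain the three steps, standing in for a single MapReduce job."""
    return reduce_phase(shuffle(map_phase(records)))


# Hypothetical input rows standing in for a "users" table.
records = [{"city": "chennai"}, {"city": "mumbai"}, {"city": "chennai"}]
counts = count_by_city(records)
print(counts)
```

The one-line query replaces all three hand-written functions, which is the appeal of the higher-level language for analysts who do not write Java.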