Installing Pig
To install Pig on Red Hat-compatible systems:
$ sudo yum install pig
To install Pig on SLES systems:
$ sudo zypper install pig
To install Pig on Ubuntu and other Debian systems:
$ sudo apt-get install pig

Pig automatically uses the active Hadoop configuration (standalone, pseudo-distributed, or
distributed). After installing the Pig package, you can start the Grunt shell.
To start the Grunt Shell (MRv1):
$ export PIG_CONF_DIR=/usr/lib/pig/conf
$ export PIG_CLASSPATH=/usr/lib/hbase/hbase-0.94.2-cdh4.2.1-security.jar:/usr/lib/zookeeper/zookeeper-3.4.5-cdh4.2.1.jar
$ pig
grunt>
To start the Grunt Shell (YARN):

For each user who will be submitting MapReduce jobs using MapReduce v2 (YARN), or running Pig, Hive, or Sqoop in
a YARN installation, set the HADOOP_MAPRED_HOME environment variable as follows:
$ export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
$ export PIG_CONF_DIR=/usr/lib/pig/conf
$ export PIG_CLASSPATH=/usr/lib/hbase/hbase-0.94.2-cdh4.2.1-security.jar:/usr/lib/zookeeper/zookeeper-3.4.5-cdh4.2.1.jar
$ pig
...
grunt>
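Because these variables are lost when the shell exits, each user who runs Pig under YARN would otherwise have to re-export them every session. One way to persist them (an approach assumed here, not prescribed by the original text) is to append the exports to the user's ~/.bashrc, using the same CDH4 paths shown above:

```shell
# Sketch: persist the YARN/Pig environment variables for future shell
# sessions. Paths match the CDH4 layout above; adjust for your install.
cat >> ~/.bashrc <<'EOF'
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
export PIG_CONF_DIR=/usr/lib/pig/conf
export PIG_CLASSPATH=/usr/lib/hbase/hbase-0.94.2-cdh4.2.1-security.jar:/usr/lib/zookeeper/zookeeper-3.4.5-cdh4.2.1.jar
EOF
```

New login shells will then pick up the variables automatically; run `source ~/.bashrc` to apply them to the current session.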
To verify that the input and output directories from the example grep job exist, list an HDFS directory from the Grunt shell:
grunt> ls
hdfs://localhost/user/joe/input <dir>
hdfs://localhost/user/joe/output <dir>
To run a grep example job over the input files using Pig:
grunt> A = LOAD 'input';
grunt> B = FILTER A BY $0 MATCHES '.*dfs[a-z.]+.*';
grunt> DUMP B;
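DUMP prints the matching lines to the console. To write them back to HDFS instead, Pig's STORE operator saves a relation to a directory. A sketch of the same job with STORE (the output path 'pig-grep-output' is an illustrative name, not from the original; the directory must not already exist):

```
grunt> A = LOAD 'input';
grunt> B = FILTER A BY $0 MATCHES '.*dfs[a-z.]+.*';
grunt> STORE B INTO 'pig-grep-output';
```

The results can then be inspected afterwards with ls or cat from the Grunt shell.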
To check the status of your job while it is running, look at the JobTracker web console at
http://localhost:50030/.