Pig Installation on Ubuntu

Step 1 : Install Pig from the Cloudera repository

$ sudo apt-get install pig

Step 2 : Each user who submits MapReduce jobs with MapReduce v1 (MRv1), or who runs Pig, Hive, or Sqoop in an MRv1 installation, must set the HADOOP_MAPRED_HOME environment variable as follows (skip this step if it is already set):

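$ gedit ~/.bashrc

Add the following line to the file, save it, and run "source ~/.bashrc" (or open a new terminal) so that it takes effect. Do not use sudo here: the file belongs to your own user, and editing it as root can change its ownership.
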
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
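
If you prefer not to edit the file by hand, the same change can be made from the terminal. The snippet below is only a sketch: the function name add_mapred_home is made up for this post, and it assumes your shell reads ~/.bashrc at startup. It appends the export line only when the exact line is not already present, so running it twice does no harm.

```shell
# Hypothetical helper: append the export line to a profile file,
# but only if that exact line is not already in it.
add_mapred_home() {
    profile="$1"
    line='export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce'
    # Make sure the file exists so grep does not fail on a missing file.
    touch "$profile"
    grep -qxF "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}
```

To use it, call add_mapred_home "$HOME/.bashrc", then run source ~/.bashrc and check the result with echo $HADOOP_MAPRED_HOME.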

Step 3 : Start Pig in interactive mode (MRv1)

$ pig


Step 4 : Example

grunt> ls
hdfs://localhost/user/joe/input <dir>
grunt> A = LOAD 'input';
grunt> B = FILTER A BY $0 MATCHES '.*dfs[a-z.]+.*';
grunt> DUMP B; 
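
To get a feel for what the FILTER keeps, you can try the same pattern locally with grep. This is only a rough stand-in, not what Pig runs internally: it assumes each line contains no tab characters (Pig's default loader splits fields on tabs, and $0 is the first field), and the three sample lines are invented for illustration. Pig's MATCHES has to match the whole value, which is why the Pig pattern is wrapped in .* on both sides; grep matches anywhere in the line, so the bare pattern is enough.

```shell
# Hypothetical sample lines standing in for the uploaded XML files.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<name>dfs.replication</name>
<name>fs.defaultFS</name>
<value>1</value>
EOF

# Keep only lines that contain "dfs" followed by one or more
# lowercase letters or dots, as the Pig FILTER above does.
matches=$(grep -E 'dfs[a-z.]+' "$sample")
echo "$matches"
rm -f "$sample"
```

Only the dfs.replication line is printed; fs.defaultFS does not contain the substring "dfs", so it is dropped.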
 
[For this example to run, the input directory must exist in HDFS. If you did
not already create it during the Hadoop installation steps described earlier,
create it now:
 
$ sudo -u hdfs hadoop fs -mkdir -p /user/$USER

$ sudo -u hdfs hadoop fs -chown $USER /user/$USER

$ hadoop fs -mkdir input

$ hadoop fs -put /etc/hadoop/conf/*.xml input

$ hadoop fs -ls input ] 
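
If you rerun these steps often, the last part can be wrapped so that it skips the upload when the directory is already there. This is a sketch under a few assumptions: the function name prepare_input is made up for this post, the hadoop client is on your PATH, and /user/$USER already exists in HDFS (the first two commands above take care of that).

```shell
# Hypothetical helper: create the HDFS input directory and upload the
# Hadoop config files, but only if the directory does not exist yet.
prepare_input() {
    # "hadoop fs -test -d" returns success when the path is a directory.
    if hadoop fs -test -d input; then
        echo "input already exists, skipping upload"
    else
        hadoop fs -mkdir input &&
        hadoop fs -put /etc/hadoop/conf/*.xml input
    fi
}
```

Call prepare_input, then check the result with hadoop fs -ls input.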
