Step 1 : Install Pig from the Cloudera repository
$ sudo apt-get install pig
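To verify the installation, you can ask Pig to print its version; the exact output depends on the CDH release that was installed:
$ pig -version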
Step 2 : For each user who will be submitting MapReduce jobs using MapReduce v1 (MRv1), or running Pig, Hive, or Sqoop in an MRv1 installation, set the HADOOP_MAPRED_HOME environment variable if it is not already set. Open the user's .bashrc:
$ gedit ~/.bashrc
Add the following line at the end of the file:
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
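The setting takes effect in new shells. To apply it to the current session and confirm the value, the usual shell commands should suffice:
$ source ~/.bashrc
$ echo $HADOOP_MAPRED_HOME
/usr/lib/hadoop-0.20-mapreduce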
Step 3 : To start Pig in interactive mode (MRv1)
$ pig
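By default this starts Pig in MapReduce mode and connects to the cluster. For quick experiments that read and write the local filesystem instead of HDFS, Pig can also be started in local mode:
$ pig -x local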
Step 4 : Examples
grunt> ls
hdfs://localhost/user/joe/input <dir>
grunt> A = LOAD 'input';
grunt> B = FILTER A BY $0 MATCHES '.*dfs[a-z.]+.*';
grunt> DUMP B;
[For this example to run, the input directory must exist in HDFS. If you have not already created it in the Hadoop installation steps mentioned earlier, create it as follows:
$ sudo -u hdfs hadoop fs -mkdir -p /user/$USER
$ sudo -u hdfs hadoop fs -chown $USER /user/$USER
$ hadoop fs -mkdir input
$ hadoop fs -put /etc/hadoop/conf/*.xml input
$ hadoop fs -ls input ]
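As a small follow-up sketch: the DUMP in the example above prints the matches to the console. To keep the results in HDFS instead, you could STORE the relation and read it back from the command line (the directory name 'pig_output' is just an illustrative choice and must not already exist):
grunt> STORE B INTO 'pig_output';
grunt> quit
$ hadoop fs -cat pig_output/part-*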