Step 1: Install Hue
On Ubuntu or Debian systems:
- On the Hue Server machine, install the hue package:
$ sudo apt-get install hue
- For MRv1: on the system that hosts the JobTracker, if different from the Hue Server machine, install the hue-plugins package:
$ sudo apt-get install hue-plugins
Step 2: Configure Hue
2.1. For WebHDFS only:
2.1.1. Add the following property to hdfs-site.xml to enable WebHDFS on the NameNode and DataNodes:
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
Restart your HDFS cluster.
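For example, on a CDH 4 package-based installation the restart might look like this (the service names assume the standard CDH packages; adjust for your setup):
$ sudo service hadoop-hdfs-namenode restart    # on the NameNode host
$ sudo service hadoop-hdfs-datanode restart    # on each DataNode host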
Next, configure Hue as a proxy user for all other users and groups, meaning it may submit requests on behalf of any other user.
2.1.2. WebHDFS: Add the following to core-site.xml:
<!-- Hue WebHDFS proxy user setting -->
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
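After restarting HDFS with these settings in place, you can sanity-check WebHDFS and the proxy-user setup from the command line. These are illustrative checks only, assuming the default NameNode web port and a made-up test user named alice:
# Confirm WebHDFS responds:
$ curl "http://FQDN:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hue"
# Confirm hue may impersonate another user via the doas parameter:
$ curl "http://FQDN:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hue&doas=alice"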
2.1.3. With root privileges, update hadoop.hdfs_clusters.default.webhdfs_url in hue.ini to point to the address of either WebHDFS or HttpFS:
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      # Use WebHdfs/HttpFs as the communication mechanism.
      # For WebHDFS:
      ...
      webhdfs_url=http://FQDN:50070/webhdfs/v1/
2.2. MRv1 Configuration
Hue communicates with the JobTracker via the Hue plugin, which is a .jar file that should be placed in your MapReduce lib directory.
2.2.1. If your JobTracker and Hue Server are located on the same host, copy the plugin file over. On CDH 4 with MRv1, the MapReduce library directory is typically /usr/lib/hadoop-0.20-mapreduce/lib:
$ cd /usr/lib/hue
$ cp desktop/libs/hadoop/java-lib/hue-plugins-*.jar /usr/lib/hadoop-0.20-mapreduce/lib
If your JobTracker runs on a different host, scp the Hue plugins .jar file to the JobTracker host.
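For example (the destination host name is a placeholder):
$ scp /usr/lib/hue/desktop/libs/hadoop/java-lib/hue-plugins-*.jar <jobtracker_host>:/usr/lib/hadoop-0.20-mapreduce/lib/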
2.2.2. Add the following properties to mapred-site.xml:
<property>
  <name>jobtracker.thrift.address</name>
  <value>0.0.0.0:9290</value>
</property>
<property>
  <name>mapred.jobtracker.plugins</name>
  <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
  <description>Comma-separated list of jobtracker plug-ins to be activated.</description>
</property>
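Then restart the JobTracker so it picks up the plugin; on a CDH 4 package-based installation that might be:
$ sudo service hadoop-0.20-mapreduce-jobtracker restart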
You can confirm that the plugins are running correctly by tailing the daemon logs:
$ tail --lines=500 /var/log/hadoop-0.20-mapreduce/hadoop*jobtracker*.log | grep ThriftPlugin
2009-09-28 16:30:44,337 INFO org.apache.hadoop.thriftfs.ThriftPluginServer: Starting Thrift server
2009-09-28 16:30:44,419 INFO org.apache.hadoop.thriftfs.ThriftPluginServer:
Thrift server listening on 0.0.0.0:9290
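You can also check that the Thrift plugin is listening on the configured port (9290 above):
$ sudo netstat -lnpt | grep 9290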
2.3. Hive Configuration
The Beeswax daemon has been replaced by HiveServer2, so Hue should point to a running HiveServer2 instance. This change involved the following major updates to the [beeswax] section of the Hue configuration file, hue.ini:
[beeswax]
# Host where Hive server Thrift daemon is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
## hive_server_host=<FQDN of HiveServer2>
# Port where HiveServer2 Thrift server runs on.
## hive_server_port=10000
Existing Hive Installation
In the Hue configuration file hue.ini, modify hive_conf_dir to point to the directory containing hive-site.xml.
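Putting the Hive settings together, a filled-in [beeswax] section might look like the following. The host name is a placeholder, and /etc/hive/conf is only the usual CDH default location of hive-site.xml; adjust both for your cluster:
[beeswax]
  hive_server_host=hiveserver2.example.com
  hive_server_port=10000
  hive_conf_dir=/etc/hive/conf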
2.4. HADOOP_CLASSPATH
If you are setting $HADOOP_CLASSPATH in your hadoop-env.sh, be sure to set it in a way that preserves user-specified options. For example:
Correct:
HADOOP_CLASSPATH=<your_additions>:$HADOOP_CLASSPATH
Incorrect:
HADOOP_CLASSPATH=<your_additions>
This enables certain components of Hue to add to Hadoop's classpath using the environment variable.
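As a concrete sketch in hadoop-env.sh (the jar path is a made-up example):
export HADOOP_CLASSPATH=/opt/myjars/extra.jar:$HADOOP_CLASSPATH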
2.5. hadoop.tmp.dir
If your users are likely to submit jobs both through Hue and from the same machine via the command line, they will be doing so as the hue user when using Hue and as their own user account when using the command line. This leads to contention on the directory specified by hadoop.tmp.dir, which defaults to /tmp/hadoop-${user.name}. Specifically, hadoop.tmp.dir is where the hadoop jar command unpacks job JARs. One workaround is to set hadoop.tmp.dir to /tmp/hadoop-${user.name}-${hue.suffix} in core-site.xml:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-${user.name}-${hue.suffix}</value>
</property>
Unfortunately, when the hue.suffix variable is unset, you'll end up with directories in /tmp whose names contain the literal, unexpanded string ${hue.suffix}. Despite that, Hue will still work.
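As a made-up illustration, for a command-line user named alice you would see something like:
$ ls -d /tmp/hadoop-alice*
/tmp/hadoop-alice-${hue.suffix}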
Step 3: The complete hue.ini configuration reference is available in the Cloudera documentation.